Journal ArticleDOI

Signed and unsigned reward prediction errors dynamically enhance learning and memory

04 Mar 2021-eLife (eLife Sciences Publications Limited)-Vol. 10
TL;DR: It is found that both signed and unsigned RPEs enhance memory, in line with midbrain dopamine and locus-coeruleus modulation of hippocampal plasticity, thereby reconciling separate findings in the literature.
Abstract: Memory helps guide behavior, but which experiences from the past are prioritized? Classic models of learning posit that events associated with unpredictable outcomes, as well as, paradoxically, predictable outcomes, recruit more attention and learning. Here, we test reinforcement learning and subsequent memory for those events, and treat signed and unsigned reward prediction errors (RPEs), experienced at the reward-predictive cue or reward outcome, as drivers of these two seemingly contradictory signals. By fitting reinforcement learning models to behavior, we find that both RPEs contribute to learning by modulating a dynamically changing learning rate. We further characterize the effects of these RPE signals on memory and show that both signed and unsigned RPEs enhance memory, in line with midbrain dopamine and locus-coeruleus modulation of hippocampal plasticity, thereby reconciling separate findings in the literature.
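As a concrete illustration of the modeling idea summarized above, the sketch below implements a Rescorla-Wagner value update whose learning rate is scaled, trial by trial, by the recent unsigned RPE (a Pearce-Hall-style associability term). This is a minimal sketch under assumed parameter names (`alpha0`, `kappa`), not the authors' exact fitted model.

```python
import numpy as np

def simulate_learning(rewards, alpha0=0.3, kappa=0.5):
    """Rescorla-Wagner learning with a dynamically modulated learning rate.

    Each trial's learning rate is the base rate alpha0 scaled by an
    associability term that tracks recent unsigned RPEs, so surprising
    outcomes transiently speed up learning. Illustrative sketch only.
    """
    value = 0.0            # current reward expectation for the cue
    associability = 1.0    # attention weight driven by unsigned RPEs
    values, rpes = [], []
    for r in rewards:
        rpe = r - value                      # signed reward prediction error
        value += alpha0 * associability * rpe
        # unsigned RPE feeds into the next trial's effective learning rate
        associability = (1 - kappa) * associability + kappa * abs(rpe)
        values.append(value)
        rpes.append(rpe)
    return np.array(values), np.array(rpes)

# Toy usage: 30 rewards drawn around a mean of 0.7
rewards = np.random.default_rng(0).normal(0.7, 0.1, size=30)
values, rpes = simulate_learning(rewards)
```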
Citations
Journal ArticleDOI
01 Mar 2022-Neuron
TL;DR: This work makes specific predictions about how cerebellar circuits can work in concert with the basal ganglia to guide different stages of learning and strengthens the emerging consensus that the cerebellum plays a pivotal role in shaping cognitive processing.

19 citations

Journal ArticleDOI
TL;DR: It is shown that activity patterns in human CA3/dentate gyrus exhibit an abrupt, temporally-specific decorrelation of highly similar memory representations that is precisely coupled with behavioral expressions of successful learning, establishing a critical link between hippocampal remapping and episodic memory interference and providing insight into why remapping occurs.
Abstract: Remapping refers to a decorrelation of hippocampal representations of similar spatial environments. While it has been speculated that remapping may contribute to the resolution of episodic memory interference in humans, direct evidence is surprisingly limited. We tested this idea using high-resolution, pattern-based fMRI analyses. Here we show that activity patterns in human CA3/dentate gyrus exhibit an abrupt, temporally-specific decorrelation of highly similar memory representations that is precisely coupled with behavioral expressions of successful learning. The magnitude of this learning-related decorrelation was predicted by the amount of pattern overlap during initial stages of learning, with greater initial overlap leading to stronger decorrelation. Finally, we show that remapped activity patterns carry relatively more information about learned episodic associations compared to competing associations, further validating the learning-related significance of remapping. Collectively, these findings establish a critical link between hippocampal remapping and episodic memory interference and provide insight into why remapping occurs. When two memories are similar, their encoding and retrieval can be disrupted by each other. Here the authors show that memory interference is resolved through abrupt remapping of activity patterns in the human hippocampal CA3 and dentate gyrus.
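To make the pattern-similarity logic concrete, the toy sketch below (not the study's actual fMRI pipeline; array sizes and noise levels are arbitrary assumptions) quantifies "remapping" as a drop in the correlation between two multivoxel activity patterns.

```python
import numpy as np

def pattern_similarity(a, b):
    """Pearson correlation between two multivoxel activity patterns."""
    return np.corrcoef(a, b)[0, 1]

rng = np.random.default_rng(1)
n_voxels = 200

# Early in learning: two highly overlapping patterns (toy data)
pattern_a = rng.normal(size=n_voxels)
pattern_b_early = pattern_a + rng.normal(scale=0.3, size=n_voxels)

# After learning: pattern B is regenerated independently ("remapped")
pattern_b_late = rng.normal(size=n_voxels)

print(pattern_similarity(pattern_a, pattern_b_early))  # high overlap
print(pattern_similarity(pattern_a, pattern_b_late))   # near zero: decorrelated
```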

15 citations

Posted ContentDOI
TL;DR: Enhanced item memory in the absence of gist-based mistakes suggests that violations enhanced memory for items’ details, which could be mediated via distinct memory traces.
Abstract: When our experience violates our predictions, it is adaptive to update our knowledge to promote a more accurate representation of the world and facilitate future predictions. Theoretical models propose that these mnemonic prediction errors should be encoded into a distinct memory trace to prevent interference with previous, conflicting memories. We investigated this proposal by repeatedly exposing participants to pairs of sequentially presented objects (A → B), thus evoking expectations. Then, we violated participants' expectations by replacing the second object in the pairs with a novel object (A → C). The following item memory test required participants to discriminate between identical old items and similar lures, thus testing detailed and distinctive item memory representations. In two experiments, mnemonic prediction errors enhanced item memory: Participants correctly identified more old items as old when those items violated expectations during learning, compared with items that did not violate expectations. This memory enhancement for C items was only observed when participants later showed intact memory for the related A → B pairs, suggesting that strong predictions are required to facilitate memory for violations. Following up on this, a third experiment reduced prediction strength prior to violation and subsequently eliminated the memory advantage of violations. Interestingly, mnemonic prediction errors did not increase gist-based mistakes of identifying old items as similar lures or identifying similar lures as old. Enhanced item memory in the absence of gist-based mistakes suggests that violations enhanced memory for items' details, which could be mediated via distinct memory traces. Together, these results advance our knowledge of how mnemonic prediction errors promote memory formation.

13 citations


Cites background from "Signed and unsigned reward predicti..."

  • ...While reward prediction errors have long been established as promoting learning and decision making, their role in long-term memory has only been appreciated more recently (e.g., Wimmer et al. 2014; Rouhani et al. 2018, 2020; Jang et al. 2019; Ergo et al. 2020; Rouhani and Niv 2021)....

    [...]

  • ...These studies have generally shown memory benefits for stimuli appearing during a reward prediction error (Wimmer et al. 2014; Davidow et al. 2016; De Loof et al. 2018; Rouhani et al. 2018; Jang et al. 2019; Kalbe and Schwabe 2019; Ergo et al. 2020; Rouhani and Niv 2021)....

    [...]

Posted ContentDOI
TL;DR: A model suggests a central role for the dynamics of memory reactivation in determining the influence of different kinds of memory on decisions, and proposes that representation-specific dynamics can implement a bottom-up “product of experts” rule that integrates multiple sets of action-outcome predictions weighted by their uncertainty.
Abstract: Memories affect nearly every aspect of our mental life. They allow us to both resolve uncertainty in the present and to construct plans for the future. Recently, renewed interest in the role memory plays in adaptive behavior has led to new theoretical advances and empirical observations. We review key findings, with particular emphasis on how the retrieval of many kinds of memories affects deliberative action selection. These results are interpreted in a sequential inference framework, in which reinstatements from memory serve as "samples" of potential action outcomes. The resulting model suggests a central role for the dynamics of memory reactivation in determining the influence of different kinds of memory in decisions. We propose that representation-specific dynamics can implement a bottom-up "product of experts" rule that integrates multiple sets of action-outcome predictions weighted based on their uncertainty. We close by reviewing related findings and identifying areas for further research. This article is categorized under: Psychology > Reasoning and Decision Making Neuroscience > Cognition Neuroscience > Computation.
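A minimal sketch of the "product of experts" integration described in this abstract, assuming each memory source contributes an independent Gaussian prediction about an action's outcome: the combined estimate weights each source by its precision (inverse variance). The particular sources and numbers are illustrative assumptions.

```python
import numpy as np

def product_of_gaussian_experts(means, variances):
    """Combine independent Gaussian predictions by precision weighting.

    Each expert's weight is its precision (1 / variance), so less
    uncertain memory sources dominate the integrated prediction.
    """
    means = np.asarray(means, dtype=float)
    precisions = 1.0 / np.asarray(variances, dtype=float)
    combined_precision = precisions.sum()
    combined_mean = (precisions * means).sum() / combined_precision
    return combined_mean, 1.0 / combined_precision

# e.g. an episodic sample, a cached value, and a model-based estimate
mean, var = product_of_gaussian_experts(means=[2.0, 1.0, 1.5],
                                        variances=[0.5, 2.0, 1.0])
```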

13 citations

Journal ArticleDOI
TL;DR: The authors proposed a taxonomy of surprise definitions and classified them into four conceptual categories based on the quantity they measure: prediction surprise, change point detection surprise, information gain surprise, and confidence-corrected surprise.
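For orientation, two of the most widely used formalizations behind such taxonomies are sketched below in their standard textbook forms (these are not necessarily the exact definitions adopted in this article): the Shannon or "prediction" surprise of an observation, and the Bayesian or "information gain" surprise measuring how far the observation moves the learner's beliefs.

```latex
% Shannon (prediction) surprise of observation x under the current model
S_{\mathrm{pred}}(x) = -\log p(x \mid \mathrm{model})

% Bayesian (information-gain) surprise: KL divergence from prior to posterior
S_{\mathrm{IG}}(x) = D_{\mathrm{KL}}\!\left[ p(\theta \mid x) \,\|\, p(\theta) \right]
```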

11 citations

References
Journal ArticleDOI
TL;DR: In this article, a model is described in an lmer call by a formula, in this case including both fixed- and random-effects terms, and the formula and data together determine a numerical representation of the model from which the profiled deviance or the profiled REML criterion can be evaluated as a function of some of the model parameters.
Abstract: Maximum likelihood or restricted maximum likelihood (REML) estimates of the parameters in linear mixed-effects models can be determined using the lmer function in the lme4 package for R. As for most model-fitting functions in R, the model is described in an lmer call by a formula, in this case including both fixed- and random-effects terms. The formula and data together determine a numerical representation of the model from which the profiled deviance or the profiled REML criterion can be evaluated as a function of some of the model parameters. The appropriate criterion is optimized, using one of the constrained optimization functions in R, to provide the parameter estimates. We describe the structure of the model, the steps in evaluating the profiled deviance or REML criterion, and the structure of classes or types that represents such a model. Sufficient detail is included to allow specialization of these structures by users who wish to write functions to fit specialized linear mixed models, such as models incorporating pedigrees or smoothing splines, that are not easily expressible in the formula language used by lmer.
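The mixed-effects analyses in the citing paper used lmer in R (see the quote below); as a rough Python analogue, offered only as an assumed illustration rather than the paper's actual code, an equivalent random-intercept model can be fit with statsmodels. The column names (`memory`, `rpe`, `subject`) are hypothetical.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical trial-level data: a memory score per item, the RPE at
# encoding, and a participant identifier (toy values only).
df = pd.DataFrame({
    "memory":  [1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1],
    "rpe":     [0.8, -0.2, 0.5, 0.9, -0.6, 0.3, -0.4, 0.7, 0.2, -0.1, 0.6, 0.4],
    "subject": ["s1", "s1", "s1", "s1", "s2", "s2", "s2", "s2", "s3", "s3", "s3", "s3"],
})

# Linear mixed model: fixed effect of RPE, random intercept per subject,
# roughly analogous to lmer(memory ~ rpe + (1 | subject)) in R.
model = smf.mixedlm("memory ~ rpe", data=df, groups=df["subject"])
result = model.fit(reml=True)
print(result.summary())
```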

50,607 citations


"Signed and unsigned reward predicti..." refers methods in this paper

  • ...We used mixed-effects modeling to test hypotheses throughout the paper (lme4 package in R; Bates et al., 2015)....

    [...]


Book
01 Jan 1988
TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Abstract: Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.
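As a minimal, generic illustration of the temporal-difference methods covered in Part II of the book (a textbook sketch, unrelated to any specific task in the citing paper), tabular TD(0) nudges a state's value toward the reward plus the discounted value of the next state.

```python
def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular TD(0) backup: move V[s] toward r + gamma * V[s_next]."""
    td_error = r + gamma * V[s_next] - V[s]   # temporal-difference error
    V[s] += alpha * td_error
    return td_error

# Toy episode repeated 200 times: state 0 -> state 1 (no reward),
# then state 1 -> terminal (+1 reward).
V = {0: 0.0, 1: 0.0, "terminal": 0.0}
for _ in range(200):
    td0_update(V, s=0, r=0.0, s_next=1)
    td0_update(V, s=1, r=1.0, s_next="terminal")
# V[1] approaches 1.0 and V[0] approaches gamma * V[1] = 0.9
```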

37,989 citations


"Signed and unsigned reward predicti..." refers methods in this paper

  • ...…their $\alpha_t$) once they’ve learned the average values of the reward categories, we tested a model with exponential decay of the learning rate over time (Sutton and Barto, 1998; model: ‘RW-D’): $\alpha_t = \eta + N e^{-\lambda t_c}$ (5), where $N$ is the initial value, $\lambda$ is the decay constant, and $t_c$ is the trial number for…...

    [...]

01 Jan 2005
TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
Abstract: The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.
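The criterion derived in this reference is what is now called the Bayesian information criterion; for a model with k free parameters, maximized likelihood, and n observations it takes the standard form below (a textbook statement, not a quotation from the paper).

```latex
\mathrm{BIC} = k \ln(n) - 2 \ln(\hat{L})
```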

36,760 citations


"Signed and unsigned reward predicti..." refers methods in this paper

  • ...We then compared model recovery using the conservative Bayesian information criterion (BIC; Schwarz, 1978), to calculate a confusion matrix demonstrating the proportion of…

    [...]

Journal ArticleDOI
14 Mar 1997-Science
TL;DR: Findings in this work indicate that dopaminergic neurons in the primate, whose fluctuating output apparently signals changes or errors in the predictions of future salient and rewarding events, can be understood through quantitative theories of adaptive optimizing control.
Abstract: The capacity to predict future events permits a creature to detect, model, and manipulate the causal structure of its interactions with its environment. Behavioral experiments suggest that learning is driven by changes in the expectations about future salient events such as rewards and punishments. Physiological work has recently complemented these studies by identifying dopaminergic neurons in the primate whose fluctuating output apparently signals changes or errors in the predictions of future salient and rewarding events. Taken together, these findings can be understood through quantitative theories of adaptive optimizing control.
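The "quantitative theories" referred to here are temporal-difference models of reward learning; the signed prediction error attributed to phasic dopaminergic firing is standardly written as below, with its absolute value serving as the unsigned prediction error discussed in the citing paper (textbook form, not a quotation from this article).

```latex
% Signed temporal-difference (reward prediction) error at time t
\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t)

% Unsigned prediction error (surprise / salience)
|\delta_t|
```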

8,163 citations


"Signed and unsigned reward predicti..." refers background in this paper

  • ...Experiment 2, which included RPEs at cue, allowed us to test whether a putative (signed) dopaminergic RPE, which moves from reward outcome to the cue predicting reward over learning (Barto, 1995; Montague et al., 1996; Schultz et al., 1997), enhances memory for cue events....

    [...]

  • ...Over the course of learning, this dopaminergic RPE transfers from unpredictable reward outcome to the cue predicting the reward (Schultz et al., 1997)....

    [...]

  • ...As mentioned, one dominant hypothesis is that dopaminergic midbrain signals convey signed RPEs to target areas (Barto, 1995; Montague et al., 1996; Schultz et al., 1997)....

    [...]