Open access | Journal Article | DOI: 10.7554/ELIFE.61077

Signed and unsigned reward prediction errors dynamically enhance learning and memory

04 Mar 2021-eLife (eLife Sciences Publications Limited)-Vol. 10
Abstract: Memory helps guide behavior, but which experiences from the past are prioritized? Classic models of learning posit that events associated with unpredictable outcomes as well as, paradoxically, predictable outcomes, deploy more attention and learning for those events. Here, we test reinforcement learning and subsequent memory for those events, and treat signed and unsigned reward prediction errors (RPEs), experienced at the reward-predictive cue or reward outcome, as drivers of these two seemingly contradictory signals. By fitting reinforcement learning models to behavior, we find that both RPEs contribute to learning by modulating a dynamically changing learning rate. We further characterize the effects of these RPE signals on memory and show that both signed and unsigned RPEs enhance memory, in line with midbrain dopamine and locus-coeruleus modulation of hippocampal plasticity, thereby reconciling separate findings in the literature.
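As a rough illustration of how signed and unsigned RPEs could jointly drive a dynamically changing learning rate, the following is a minimal sketch of a hybrid Rescorla-Wagner / Pearce-Hall update. It is not the authors' fitted model: the function names, the parameters `eta` and `kappa`, and the assumption that rewards lie in [0, 1] are all hypothetical choices made for the example.

```python
def update_value(value, alpha, reward, eta=0.3, kappa=1.0):
    """One trial of an illustrative hybrid Rescorla-Wagner / Pearce-Hall update.

    value  : current reward expectation for the cue
    alpha  : current (dynamic) learning rate
    reward : observed outcome, assumed to lie in [0, 1]
    eta    : hypothetical weight on the newest surprise when updating alpha
    kappa  : hypothetical fixed scaling of the learning rate
    """
    signed_rpe = reward - value        # signed RPE: better or worse than expected
    unsigned_rpe = abs(signed_rpe)     # unsigned RPE: surprise regardless of sign
    # The signed RPE sets the direction and size of the value update.
    new_value = value + kappa * alpha * signed_rpe
    # The unsigned RPE raises the learning rate after surprising outcomes
    # and lets it decay after well-predicted ones.
    new_alpha = (1 - eta) * alpha + eta * unsigned_rpe
    return new_value, new_alpha, signed_rpe


def simulate(rewards, value=0.0, alpha=0.5):
    """Run the update over a sequence of rewards; return (value, alpha, rpe) per trial."""
    history = []
    for reward in rewards:
        value, alpha, rpe = update_value(value, alpha, reward)
        history.append((value, alpha, rpe))
    return history


if __name__ == "__main__":
    for trial, (v, a, d) in enumerate(simulate([1.0, 1.0, 1.0, 0.0]), start=1):
        print(f"trial {trial}: value={v:.3f} alpha={a:.3f} signed_rpe={d:+.3f}")
```

In this sketch the learning rate rises after a surprising outcome and shrinks as outcomes become predictable, which is one simple way to express the idea that both RPE signals modulate learning.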

Topics: Reinforcement learning (54%)
Citations

6 results found


Open access | Posted Content | DOI: 10.1002/WCS.1581
Abstract: Memories affect nearly every aspect of our mental life. They allow us to both resolve uncertainty in the present and to construct plans for the future. Recently, renewed interest in the role memory plays in adaptive behavior has led to new theoretical advances and empirical observations. We review key findings, with particular emphasis on how the retrieval of many kinds of memories affects deliberative action selection. These results are interpreted in a sequential inference framework, in which reinstatements from memory serve as "samples" of potential action outcomes. The resulting model suggests a central role for the dynamics of memory reactivation in determining the influence of different kinds of memory in decisions. We propose that representation-specific dynamics can implement a bottom-up "product of experts" rule that integrates multiple sets of action-outcome predictions weighted based on their uncertainty. We close by reviewing related findings and identifying areas for further research. This article is categorized under: Psychology > Reasoning and Decision Making; Neuroscience > Cognition; Neuroscience > Computation.

Topics: Episodic memory (56%), Cognition (54%), Action selection (52%)

8 Citations


Open access | Posted Content | DOI: 10.1101/LM.053410.121
Oded Bein, Natalie A. Plotkin, Lila Davachi
01 Nov 2021-Learning & Memory
Abstract: When our experience violates our predictions, it is adaptive to update our knowledge to promote a more accurate representation of the world and facilitate future predictions. Theoretical models propose that these mnemonic prediction errors should be encoded into a distinct memory trace to prevent interference with previous, conflicting memories. We investigated this proposal by repeatedly exposing participants to pairs of sequentially presented objects (A → B), thus evoking expectations. Then, we violated participants' expectations by replacing the second object in the pairs with a novel object (A → C). The following item memory test required participants to discriminate between identical old items and similar lures, thus testing detailed and distinctive item memory representations. In two experiments, mnemonic prediction errors enhanced item memory: Participants correctly identified more old items as old when those items violated expectations during learning, compared with items that did not violate expectations. This memory enhancement for C items was only observed when participants later showed intact memory for the related A → B pairs, suggesting that strong predictions are required to facilitate memory for violations. Following up on this, a third experiment reduced prediction strength prior to violation and subsequently eliminated the memory advantage of violations. Interestingly, mnemonic prediction errors did not increase gist-based mistakes of identifying old items as similar lures or identifying similar lures as old. Enhanced item memory in the absence of gist-based mistakes suggests that violations enhanced memory for items' details, which could be mediated via distinct memory traces. Together, these results advance our knowledge of how mnemonic prediction errors promote memory formation.

Topics: Mnemonic (54%)

2 Citations


Open access | Journal Article | DOI: 10.1038/S41467-021-25126-0
Guo Wanjia, Serra E. Favila, Ghootae Kim, Robert J. Molitor, +1 more
Abstract: Remapping refers to a decorrelation of hippocampal representations of similar spatial environments. While it has been speculated that remapping may contribute to the resolution of episodic memory interference in humans, direct evidence is surprisingly limited. We tested this idea using high-resolution, pattern-based fMRI analyses. Here we show that activity patterns in human CA3/dentate gyrus exhibit an abrupt, temporally-specific decorrelation of highly similar memory representations that is precisely coupled with behavioral expressions of successful learning. The magnitude of this learning-related decorrelation was predicted by the amount of pattern overlap during initial stages of learning, with greater initial overlap leading to stronger decorrelation. Finally, we show that remapped activity patterns carry relatively more information about learned episodic associations compared to competing associations, further validating the learning-related significance of remapping. Collectively, these findings establish a critical link between hippocampal remapping and episodic memory interference and provide insight into why remapping occurs. When two memories are similar, their encoding and retrieval can be disrupted by each other. Here the authors show that memory interference is resolved through abrupt remapping of activity patterns in the human hippocampal CA3 and dentate gyrus.

Topics: Episodic memory (54%), Interference theory (52%)

1 Citation


Open access | Journal Article | DOI: 10.1038/S42003-021-02426-1
23 Jul 2021
Abstract: Learning signals during reinforcement learning and cognitive control rely on valenced reward prediction errors (RPEs) and non-valenced salience prediction errors (PEs) driven by surprise magnitude. A core debate in reward learning focuses on whether valenced and non-valenced PEs can be isolated in the human electroencephalogram (EEG). We combine behavioral modeling and single-trial EEG regression to disentangle sequential PEs in an interval timing task dissociating outcome valence, magnitude, and probability. Multiple regression across temporal, spatial, and frequency dimensions characterized a spatio-tempo-spectral cascade from early valenced RPE value to non-valenced RPE magnitude, followed by outcome probability indexed by a late frontal positivity. Separating negative and positive outcomes revealed the valenced RPE value effect is an artifact of overlap between two non-valenced RPE magnitude responses: frontal theta feedback-related negativity on losses and posterior delta reward positivity on wins. These results reconcile longstanding debates on the sequence of components representing reward and salience PEs in the human EEG. Hoy et al. combine behavioral modeling and single-trial EEG regression to disentangle sequential prediction errors in an interval timing task, which dissociated outcome valence, magnitude, and probability in human participants. Their study reconciles debates on the sequence of components representing reward and salience prediction errors in the human EEG.

Topics: Salience (neuroscience) (50%)

1 Citation


Posted Content | DOI: 10.1101/2021.10.26.465922
28 Oct 2021-bioRxiv
Abstract: The feedback people receive on their behavior shapes the process of belief formation and self-efficacy in mastering a given task. The neural and computational mechanisms of how the subjective value of these beliefs and corresponding affect bias the learning process are yet unclear. Here we investigate this question during learning of self-efficacy beliefs using fMRI, pupillometry, computational modeling and individual differences in affective experience. Biases in the formation of self-efficacy beliefs were associated with affect, pupil dilation and neural activity within the anterior insula, amygdala, VTA/SN, and mPFC. Specifically, neural and pupil responses map the valence of the prediction errors in correspondence to the experienced affect and learning bias people show during belief formation. Together with the functional connectivity dynamics of the anterior insula within this network our results hint towards neural and computational mechanisms that integrate affect in the process of belief formation.

Topics: Affect (psychology) (52%), Pupillometry (51%)

References

70 results found


Open access | Journal Article | DOI: 10.18637/JSS.V067.I01
Abstract: Maximum likelihood or restricted maximum likelihood (REML) estimates of the parameters in linear mixed-effects models can be determined using the lmer function in the lme4 package for R. As for most model-fitting functions in R, the model is described in an lmer call by a formula, in this case including both fixed- and random-effects terms. The formula and data together determine a numerical representation of the model from which the profiled deviance or the profiled REML criterion can be evaluated as a function of some of the model parameters. The appropriate criterion is optimized, using one of the constrained optimization functions in R, to provide the parameter estimates. We describe the structure of the model, the steps in evaluating the profiled deviance or REML criterion, and the structure of classes or types that represents such a model. Sufficient detail is included to allow specialization of these structures by users who wish to write functions to fit specialized linear mixed models, such as models incorporating pedigrees or smoothing splines, that are not easily expressible in the formula language used by lmer.

37,650 Citations


Open access | Journal Article | DOI: 10.1214/AOS/1176344136
Abstract: The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.
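The criterion described in this abstract is commonly written as the Bayesian information criterion, BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of observations, and L the maximized likelihood; the model with the lowest BIC is preferred. The sketch below shows that arithmetic only. The function name and the two example models are hypothetical and do not come from any of the papers listed here.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L).

    log_likelihood : maximized log-likelihood of the fitted model
    n_params       : number of free parameters (k)
    n_obs          : number of observations (n)
    Lower values indicate a better trade-off between fit and complexity.
    """
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


if __name__ == "__main__":
    # Two made-up models fitted to the same 100 observations.
    simple_model = bic(log_likelihood=-100.0, n_params=2, n_obs=100)
    complex_model = bic(log_likelihood=-98.0, n_params=4, n_obs=100)
    print(f"simple:  {simple_model:.2f}")
    print(f"complex: {complex_model:.2f}")
    print("preferred:", "simple" if simple_model < complex_model else "complex")
```

Here the more complex model fits slightly better, but its two extra parameters cost more than the gain in log-likelihood, so the simpler model has the lower BIC.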

Topics: Bayesian information criterion (57%), g-prior (55%), Bayes' theorem (55%)

35,659 Citations


Open access
01 Jan 2005
Abstract: The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.

Topics: Bayes' theorem (56%), Context (language use) (54%), Asymptotic expansion (54%)

33,801 Citations


Open access | Book
Richard S. Sutton, Andrew G. Barto
01 Jan 1988
Abstract: Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.
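Temporal-difference learning, one of the basic solution methods named in this abstract, can be sketched in a few lines. The example below is a minimal tabular TD(0) update for a two-step cue-then-outcome episode; the state names, step size `alpha`, discount `gamma`, and the episode itself are assumptions made for illustration and are not taken from the book.

```python
def td0_episode(values, episode, alpha=0.1, gamma=1.0):
    """Apply tabular TD(0) updates along one episode, in place.

    values  : dict mapping state -> current value estimate
    episode : list of (state, reward, next_state); next_state is None at the end
    Returns the list of TD errors (prediction errors), one per step.
    """
    errors = []
    for state, reward, next_state in episode:
        next_value = 0.0 if next_state is None else values[next_state]
        # TD error: reward plus discounted next estimate, minus current estimate.
        delta = reward + gamma * next_value - values[state]
        values[state] += alpha * delta
        errors.append(delta)
    return errors


if __name__ == "__main__":
    values = {"cue": 0.0, "outcome": 0.0}
    episode = [("cue", 0.0, "outcome"), ("outcome", 1.0, None)]
    for run in range(1, 4):
        errors = td0_episode(values, episode)
        print(f"run {run}: errors={[round(e, 3) for e in errors]} values={values}")
```

Across repeated runs the error at the outcome shrinks while value propagates back to the cue, which is the basic behavior that makes TD errors a useful description of reward prediction errors.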

Topics: Learning classifier system (69%), Reinforcement learning (69%), Apprenticeship learning (65%)

32,257 Citations


Open access | Journal Article | DOI: 10.1126/SCIENCE.275.5306.1593
Wolfram Schultz, Peter Dayan, P R Montague
14 Mar 1997-Science
Abstract: The capacity to predict future events permits a creature to detect, model, and manipulate the causal structure of its interactions with its environment. Behavioral experiments suggest that learning is driven by changes in the expectations about future salient events such as rewards and punishments. Physiological work has recently complemented these studies by identifying dopaminergic neurons in the primate whose fluctuating output apparently signals changes or errors in the predictions of future salient and rewarding events. Taken together, these findings can be understood through quantitative theories of adaptive optimizing control.

7,378 Citations


Performance Metrics
No. of citations received by the paper in previous years

Year  Citations
2021  6