How we learn to make decisions: Rapid propagation of reinforcement learning prediction errors in humans
Citations
Cites background or methods or result from "How we learn to make decisions: Rap..."
...Testing this proposition, Krigolson and Holroyd (2007d) and Krigolson et al....
[...]
...seen in the work of Krigolson et al. (2013). In their study, Krigolson and colleagues recorded electroencephalographic data while par-...
[...]
...While the timing and topography are not consistent with previous accounts of the FRN (Miltner et al., 1997), some researchers have found that the timing of the FRN can be considerably later when visual feedback evaluation is complex in nature (see Krigolson et al., 2013 for more detail)....
[...]
...As noted above, the N100 observed by Krigolson and Holroyd (2007a) peaked shortly after the onset of the target perturbation and had a maximal component amplitude over left parietal-occipital visual areas of cortex (see Fig....
[...]
...Indeed, Heath et al.’s (2012) results, along with those of Krigolson et al. (2013) seem to suggest that the processes that underlie the N200 and P300 may be related to the encoding of target location and/or movement amplitude....
[...]
References
"How we learn to make decisions: Rap..." refers background or methods in this paper
...…0 and 0.01 (Holroyd & Coles, 2008), these weights determined the probability that an action was selected based on a softmax decision function (Sutton & Barto, 1998): P(action i is selected) = e^{w_i/τ} / (e^{w_1/τ} + e^{w_2/τ}). Here, τ represents temperature, a model parameter that determines the degree to…...
[...]
...…from a state with no value—the state before Q—into a state with value—state Q. Computational RL theories such as the method of temporal differences (Sutton & Barto, 1998) take this into account and posit that prediction errors are computed as the difference between the value and rewards of the…...
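The temporal-difference computation described in the excerpt above can be illustrated with a minimal sketch. This is a tabular toy example, not the paper's actual model; the function name, state values, and discount parameter are assumptions for illustration:

```python
# Minimal sketch of a TD(0) prediction error: the difference between
# (reward plus discounted value of the next state) and the value of the
# current state.
def td_error(reward, v_next, v_current, gamma=1.0):
    """delta = reward + gamma * V(s') - V(s)."""
    return reward + gamma * v_next - v_current

# Moving from a no-value state (V = 0) into a valued state Q (V = 0.8)
# yields a positive prediction error even before any reward is delivered.
delta = td_error(reward=0.0, v_next=0.8, v_current=0.0)
```

As the excerpt notes, it is the transition into a valued state, not the reward itself, that generates the positive prediction error here.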
[...]
...For example, a high temperature makes all options equally likely, whereas a low temperature biases the resulting probabilities toward the higher-weighted option (Sutton & Barto, 1998)....
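The temperature effect described in this excerpt can be checked with a small sketch of the two-action softmax; the weights and τ values below are hypothetical, chosen only to show the contrast:

```python
import math

# Two-action softmax: P(action 1) = e^{w1/tau} / (e^{w1/tau} + e^{w2/tau}).
def softmax_two(w1, w2, tau):
    e1 = math.exp(w1 / tau)
    e2 = math.exp(w2 / tau)
    return e1 / (e1 + e2)

# High temperature -> choices near-equiprobable; low temperature ->
# probability concentrates on the higher-weighted action.
p_hot = softmax_two(0.6, 0.4, tau=10.0)   # near 0.5
p_cold = softmax_two(0.6, 0.4, tau=0.01)  # near 1.0
```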
[...]
...Finally, to verify that the pattern of our ERP results mirrored the predictions of RL theory, we implemented a computational model that utilized a RL algorithm (cf., Sutton & Barto, 1998) to learn and perform the gambling task....
[...]
...Reinforcement learning (RL) theory proposes that the value of an action is a prediction of the subsequent reward or punishment gained by selecting that action (Sutton & Barto, 1998; Rescorla & Wagner, 1972)....
[...]
"How we learn to make decisions: Rap..." refers methods in this paper
...All analyses were done with EEGLAB (Delorme & Makeig, 2004) and custom code written in the Matlab (MathWorks, Natick, MA) programming environment....
[...]
"How we learn to make decisions: Rap..." refers background in this paper
...Seminal work by Schultz, Dayan, and Montague (1997) demonstrated that, when monkeys are initially given a reward, there is an associated phasic increase in the firing rate of dopaminergic neurons in the substantia nigra pars compacta....
[...]
"How we learn to make decisions: Rap..." refers background or result in this paper
...In summary, the pattern of changes in the dopaminergic response to the predictive cue and the reward mirrored the predictions of Rescorla and Wagner—prediction errors at the time of reward diminished and prediction errors at stimulus presentation increased with learning. Studies in humans observing the neural response to feedback have demonstrated a pattern of results similar to the theoretical predictions of Rescorla and Wagner (1972) and the results observed in monkeys by Schultz and colleagues (1997). Specifically, in a series of experiments, Holroyd and colleagues (Holroyd, Pakzad-Vaezi, & Krigolson, 2008; Holroyd & Krigolson, 2007; Holroyd & Coles, 2002) have demonstrated that the amplitude of the feedback error-related negativity (fERN), a component of the human brain ERP, is sensitive to reward expectancy and, further, that it only occurs in situations when participants must rely on feedback to determine response outcome. In other words, a fERN is only observed when one moves from a state with no value into a state with either positive or negative value. Extending this, Krigolson, Pierce, Holroyd, and Tanaka (2009) found that the magni-...
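The diminishing prediction error at reward delivery described in this excerpt can be sketched with a simple delta-rule update in the spirit of Rescorla and Wagner (1972); the learning rate, reward value, and trial count are assumptions chosen only for illustration:

```python
# Illustrative delta-rule update: with repeated rewarded trials, the
# prediction error at feedback shrinks as the learned value approaches
# the reward magnitude.
def run_trials(n_trials, reward=1.0, alpha=0.3):
    value, errors = 0.0, []
    for _ in range(n_trials):
        delta = reward - value   # prediction error at feedback
        value += alpha * delta   # value update toward the reward
        errors.append(delta)
    return errors

errors = run_trials(10)
# errors[0] is 1.0 (fully unexpected reward); later errors shrink toward 0
# as the reward becomes expected.
```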
[...]
..., the prediction of reward: Rescorla and Wagner [1972] and Sutton and Barto [1998])....
[...]
...Reinforcement learning (RL) theory proposes that the value of an action is a prediction of the subsequent reward or punishment gained by selecting that action (Sutton & Barto, 1998; Rescorla & Wagner, 1972)....
[...]