Book Chapter

Opponent Brain Systems for Reward and Punishment Learning: Causal Evidence From Drug and Lesion Studies in Humans

TLDR
The evidence for and against several hypotheses for the neural implementation of punishment learning is reviewed, focusing on human studies that compare the effects of neural perturbation, following drug administration and/or pathological conditions, on reward and punishment learning.
Abstract
Approaching rewards and avoiding punishments are core principles that govern the adaptation of behavior to the environment. The machine learning literature has proposed formal algorithms to account for how agents adapt their decisions to optimize outcomes. In principle, these reinforcement learning models could be equally applied to positive and negative outcomes, i.e., rewards and punishments. Yet many neuroscience studies have suggested that reward and punishment learning might be underpinned by distinct brain systems. Reward learning has been shown to recruit midbrain dopaminergic nuclei and ventral prefrontostriatal circuits. The picture is less clear regarding the existence and anatomy of an opponent system: several hypotheses have been formulated for the neural implementation of punishment learning. In this chapter, we review the evidence for and against each hypothesis, focusing on human studies that compare the effects of neural perturbation, following drug administration and/or pathological conditions, on reward and punishment learning.


Citations
Journal Article

The Computational Development of Reinforcement Learning during Adolescence.

TL;DR: The developmental time-course of the computational modules responsible for learning from reward or punishment, and for learning from counterfactual feedback, is traced, finding that adolescents learned from reward but were less likely to learn from punishment.
Journal Article

Task errors contribute to implicit aftereffects in sensorimotor adaptation

TL;DR: As participants adapted to a 30° rotation of cursor feedback representing hand position, the role of task errors in sensorimotor adaptation was investigated, suggesting that the system which predicts the sensory consequences of actions via exposure to sensory prediction errors is also sensitive to reward prediction errors.
Journal Article

Assessing inter-individual differences with task-related functional neuroimaging.

TL;DR: It is shown that researchers often assess how activations elicited by a variable of interest differ between individuals, and it is argued that the rationale for such analyses offers an over-large analytical and interpretational flexibility that undermines their validity.
Journal Article

The effects of reward and punishment on motor skill learning

TL;DR: Novel laboratory-based motor skill learning tasks that allow the effects of reward/punishment on selection and execution to be examined independently are required.
Journal Article

Contextual influence on confidence judgments in human reinforcement learning.

TL;DR: The effect of outcome valence (gains or losses) on confidence while participants learned stimulus-outcome associations by trial-and-error is investigated, showing that one such consequence emerges in volatile environments, where the (in)flexibility of individuals’ learning strategies differs when outcomes are framed as gains or losses.
References
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Journal Article

Technical Note: Q-Learning

TL;DR: This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989), showing that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely.
Journal Article

A Neural Substrate of Prediction and Reward

TL;DR: Findings in this work indicate that dopaminergic neurons in the primate whose fluctuating output apparently signals changes or errors in the predictions of future salient and rewarding events can be understood through quantitative theories of adaptive optimizing control.
Journal Article

Technical Note: Q-Learning

TL;DR: In this article, it is shown that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action values are represented discretely.