
Dopamine restores reward prediction errors in old age

TL;DR: The dopamine precursor levodopa (L-DOPA) increased the task-based learning rate and task performance in some older adults to the level of young adults and was linked to restoration of a canonical neural RPE.
Abstract: Senescence affects the ability to utilize information about the likelihood of rewards for optimal decision-making. Using functional magnetic resonance imaging in humans, we found that healthy older adults had an abnormal signature of expected value, resulting in an incomplete reward prediction error (RPE) signal in the nucleus accumbens, a brain region that receives rich input projections from substantia nigra/ventral tegmental area (SN/VTA) dopaminergic neurons. Structural connectivity between SN/VTA and striatum, measured by diffusion tensor imaging, was tightly coupled to inter-individual differences in the expression of this expected reward value signal. The dopamine precursor levodopa (L-DOPA) increased the task-based learning rate and task performance in some older adults to the level of young adults. This drug effect was linked to restoration of a canonical neural RPE. Our results identify a neurochemical signature underlying abnormal reward processing in older adults and indicate that this can be modulated by L-DOPA.

Summary (2 min read)

Introduction

  • Older adults are particularly poor at making decisions when faced with probabilistic rewards, possibly because of impaired learning of stimulus-outcome contingencies1,2.
  • Thus, the authors predicted that L-DOPA would increase the learning rate evident in behavior as well as boost the representation of an RPE in the nucleus accumbens of healthy older adults, specifically by increasing the component associated with the expected value.

Reinforcement learning behavior

  • The authors analyzed trial-by-trial choice behavior using a standard reinforcement learning model with a fixed β parameter (Fig. 2a).
  • Fig. 1a: On each trial, participants selected one of two fractal images, which were then highlighted in a red frame.
  • Fig. 2c: Older adults who won more on L-DOPA than placebo (n = 15) had a significantly higher learning rate under L-DOPA than placebo, whereas learning rates did not differ between placebo and L-DOPA for older adults who won less on L-DOPA than placebo (n = 17). Error bars represent ±1 s.e.m.
  • Fig. 3a: A region in the right nucleus accumbens showed greater BOLD activity for reward (R) than for expected value (Q) at the time of outcome (putative RPE).
  • L-DOPA increased the negative effect of expected value (paired t test, **P < 0.05, two-tailed), resulting in a canonical prediction error signal (both a positive effect of reward and a negative effect of expected value).

Anatomical connectivity and RPEs

  • The authors’ analysis identified substantial inter-individual variability among older adults for both reward and expected value representations in the nucleus accumbens at baseline (that is, under placebo; Supplementary Fig. 5), whereby the latter was associated with task performance.
  • Using DTI and probabilistic tractography (n = 30 older adults), the authors defined a measure of connection strength between the right SN/VTA and right striatum (Supplementary Fig. 6 and Online Methods).
  • Neither fractional anisotropy values of SN/VTA nor nucleus accumbens functional ROI correlated with expected value (Pearson’s r = 0.26 and r = 0.17, P = 0.16 and P = 0.38, respectively), suggesting that this correlation was related to circuit strength rather than to local structural integrity as determined by fractional anisotropy.
  • This suggests that these older individuals had higher baseline integrity of the nigro-striatal dopamine circuit than older adults with lower baseline levels of performance.

DISCUSSION

  • The authors used a probabilistic reinforcement learning task in combination with a pharmacological manipulation of dopamine, as well as structural and functional neuroimaging (DTI and fMRI).
  • Together with recent findings38, these results raise the possibility that dopamine might only modulate the neural representation of expected value when it is behaviorally relevant for the task at hand.
  • The connectivity strength of tracts is one DTI metric that has been reported to predict age-related performance differences39,40.
  • In summary, their findings suggest that a subgroup of older adults who underperform at baseline can show a drug-induced improvement in task performance.
  • For these older adults, L-DOPA increased a task-based learning rate and led to a canonical RPE signal by restoring the representation of expected value in the nucleus accumbens.

METHODS

  • Methods and any associated references are available in the online version of the paper.
  • Note: Supplementary information is available in the online version of the paper.

ACKNOWLEDGMENTS

  • The authors thank J. Medhora and L. Sasse for their assistance with data collection, and H. Barron and M. Klein-Flügge for their assistance with time course analyses.
  • R.C. is supported by a Wellcome Trust Research Training Fellowship (WT088286MA).
  • The Wellcome Trust Centre for Neuroimaging is supported by core funding from the Wellcome Trust (091593/Z/10/Z).

AUTHOR CONTRIBUTIONS

  • R.C. and M.G.-M. conducted the experiment, analyzed the data and prepared the manuscript.
  • Eppinger, B., Hämmerer, D. & Li, S.-C. Neuromodulation of reward-based learning and decision making in human aging.
  • Samanez-Larkin, G.R., Wagner, A.D. & Knutson, B. Expected value information improves financial risk taking across the adult life span.
  • Jocham, G., Klein, T.A. & Ullsperger, M. Dopamine-mediated reinforcement learning signals in the striatum and ventromedial prefrontal cortex underlie value-based choices.

ONLINE METHODS

  • 32 healthy adults aged 65–75 years participated in the study (Supplementary Table 1).
  • Separating the RPE into its components has often not been done, although recent studies have shown that, as R(t) is highly correlated with the RPE, a correlation between BOLD signal and R(t) – Qa(t)(t) may lead to false positive results suggesting that areas whose BOLD signal only correlates with R(t) may be thought of as representing an RPE.
  • The authors also performed separate multiple regression analyses for the same contrasts on the placebo and L-DOPA conditions separately using task performance (total won) as a between-subjects regressor.
  • The main aim of this analysis was to visualize the effect of reward and expected value on the BOLD signal, at the time of the choice and at the time of the outcome, from the nucleus accumbens functional ROI over the course of a trial.


Nature Neuroscience, Volume 16, Number 5, May 2013, p. 648
ARTICLES
Aging in humans is associated with a range of changes in cognition. For example, older adults are particularly poor at making decisions when faced with probabilistic rewards, possibly because of impaired learning of stimulus-outcome contingencies1,2. Such findings raise two fundamental questions. Namely, what are the substrates for learning in these circumstances and what accounts for this aberrant decision-making?
One function that is critical for decision-making is learning to predict rewards. There is ample evidence from animal experiments that the neuromodulator dopamine encodes the difference between actual and expected rewards (so-called RPEs)3,4. In humans, there is compelling evidence that functional activation patterns in the nucleus accumbens, a major target region of dopamine neurons5, report rewarding outcomes and associated prediction errors6–9. A more direct link to dopamine has been shown using pharmacological challenge with dopaminergic agents10,11.
In terms of what might go wrong during aging, one important clue is the well-described age-related loss of dopamine neurons in the SN/VTA12,13, evident both in histology and when using diffusion tensor imaging (DTI) as an indirect marker of structural degeneration14,15. However, the consequences for decision-making of this decline in dopamine are unclear, as there are functional interactions among the triplet of reward representations, representations of prediction errors associated with that reward and the learning of predictions that underpins the expression of these prediction errors. In older age, abnormal activity in the nucleus accumbens has been associated with suboptimal decision-making and reduced reward anticipation, but also with normal responses to rewarding outcomes16–18. This has led to the suggestion that, although older adults may maintain adequate representations of reward, they are unable to learn correctly from these representations.
We studied the effect of probabilistic rewarding outcomes on the separate reward and prediction components of a prediction error signal19 in healthy older adults. We employed a simple probabilistic instrumental conditioning problem, the two-armed bandit choice task (Fig. 1a). Older adults underwent DTI and functional magnetic resonance imaging (fMRI) in combination with a pharmacological manipulation using the dopamine precursor L-DOPA in a within-subject, double-blind, placebo-controlled study. We collected behavioral data in a group of young adults to contextualize the effects of age on performance. We did not administer L-DOPA to these young adults, implying that the effects of L-DOPA could not be compared across age groups. By exploiting a reinforcement learning model, we were able to determine which component of the prediction error (the actual and/or expected reward representation) was impaired in older age. DTI enabled us to examine nigro-striatal structural connectivity strength, based on the hypothesis that individual differences in this structural measure would predict inter-individual differences in baseline functional RPE signaling. Crucially, L-DOPA administration has been associated with greater prediction errors in young adults10 and higher learning rates in patients with Parkinson’s disease11. Thus, we predicted that L-DOPA would increase the learning rate evident in behavior as well as boost the representation of an RPE in the nucleus accumbens of healthy older adults, specifically by increasing the component associated with the expected value.
RESULTS
Behavioral performance in young and older adults
We administered placebo and L-DOPA to 32 older adults (age = 70.00 ±
3.24 years, mean ± s.d.; Supplementary Table 1) and asked them,
1 Institute of Cognitive Neuroscience, University College London, London, UK.
2 Wellcome Trust Centre for Neuroimaging, University College London, London, UK.
3 Aging Research Center, Karolinska Institute, Stockholm, Sweden.
4 Stroke and Dementia Research Centre, St. George’s University of London, London, UK.
5 Gatsby Computational Neuroscience Unit, University College London, London, UK.
6 Translational Neuroimaging Unit, Department of Biological Engineering, ETH Zurich and University of Zurich, Zurich, Switzerland.
7 Department of Psychiatry, Psychotherapy and Psychosomatics, Zurich University Hospital of Psychiatry, Zurich, Switzerland.
8 Otto-von-Guericke-University Magdeburg, Institute of Cognitive Neurology and Dementia Research, Magdeburg, Germany.
9 German Center for Neurodegenerative Diseases (DZNE), Magdeburg, Germany.
10 These authors contributed equally to this work. Correspondence should be addressed to R.C. (rumana.neuro@gmail.com).
Received 6 January; accepted 23 February; published online 24 March 2013; doi:10.1038/nn.3364
Dopamine restores reward prediction errors in old age
Rumana Chowdhury1,2,10, Marc Guitart-Masip2,3,10, Christian Lambert2,4, Peter Dayan5, Quentin Huys2,5–7, Emrah Düzel1,8,9 & Raymond J Dolan2
npg
© 2013 Nature America, Inc. All rights reserved.

as well as 22 young adults (age = 25.18 ± 3.85 years), to perform a two-armed bandit choice task (Fig. 1a). Older adults completed a similar number of trials under both conditions (placebo: 218.16 ± 1.94; L-DOPA: 218.47 ± 1.74) as young adults (218.50 ± 2.44) (all P > 0.4). Older adults had similar choice reaction times after placebo (796.81 ± 152.89 ms) and L-DOPA (781.49 ± 140.17 ms) treatment (paired t test, t_{31} = 1.01, P = 0.321), but were slower overall under both conditions than young adults (629.69 ± 156.41 ms) (independent t tests, young versus old + placebo, t_{52} = 3.91; young versus old + L-DOPA, t_{52} = 3.73; both P < 0.0005).
Overall, the amount of money won by older adults performing the task did not differ following L-DOPA treatment (£12.94 ± 0.81) compared to placebo treatment (£12.64 ± 0.89) (paired t test, t_{31} = 1.53, P = 0.137). However, older adults on placebo won significantly less money than young adults (£13.17 ± 1.00; independent samples t test, t_{52} = 2.05, P = 0.045), whereas there was no difference in the amount won between older adults treated with L-DOPA and young adults (t_{52} = 0.971, P = 0.336) (Fig. 1b).
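As a rough check on the between-group comparison, the independent samples t statistic can be approximated from the reported means, standard deviations and group sizes alone. The sketch below assumes a pooled-variance (Student) t test, which is what 52 degrees of freedom for 32 + 22 participants implies; the function name `pooled_t` is illustrative and not from the paper. Because the published summary values are rounded, the result should only be expected to land near the reported t_{52} = 2.05.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic and degrees of freedom from summary statistics.

    Assumes equal variances in the two groups (pooled-variance test).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df

# Young adults versus older adults on placebo, total won in pounds,
# using the rounded values reported in the text.
t_stat, df = pooled_t(13.17, 1.00, 22, 12.64, 0.89, 32)
print(f"t({df}) = {t_stat:.2f}")
```

The same function can be applied to the young versus older L-DOPA comparison, although agreement with the published statistic again depends on how much rounding went into the reported means and standard deviations.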
A more detailed examination of the behavioral data showed that only a proportion of older adults won more money on the task under L-DOPA compared to placebo. To examine this further, we performed a median split according to drug-induced changes in performance (Online Methods), creating a ‘win less on L-DOPA’ group (total won L-DOPA < placebo, n = 17) and a ‘win more on L-DOPA’ group (total won L-DOPA > placebo, n = 15). This analysis revealed that performance in older adults was consistent with an inverted U shape, whereby those with high baseline levels of performance on placebo performed less well on L-DOPA and, conversely, those with low baseline levels of performance improved following L-DOPA treatment (Supplementary Fig. 1). Performance in the win less on L-DOPA group on placebo and in the win more on L-DOPA group on L-DOPA was at a similar level to performance in young adults (young adults versus win less on L-DOPA group on placebo, t_{37} = 0.19, P = 0.854; young adults versus win more on L-DOPA group on L-DOPA, t_{35} = 0.40, P = 0.690), whereas performance in the win less on L-DOPA group on L-DOPA and in the win more on L-DOPA group on placebo was worse than performance in young adults (young adults versus win less on L-DOPA group on L-DOPA, t_{37} = 2.07, P = 0.045; young adults versus win more on L-DOPA group on placebo, t_{35} = 3.53, P = 0.001). This inverted U-shaped pattern of performance is consistent with previous reports of the effects of dopamine on cognition20 and suggests that variable performance across older adults is linked to individual differences in baseline dopamine status.
Reinforcement learning behavior
We analyzed trial-by-trial choice behavior using a standard reinforcement learning model with a fixed β parameter (Fig. 2a). Note that, by using this methodological approach, the learning rate reflects a summary measure of reinforcement learning strength (Online Methods). A model with a single fixed β = 1.27 across drug and placebo conditions, one single learning rate and one choice perseveration parameter provided the best model fit of older participants’ choices among the range of models that we compared, indexed by the lowest Bayesian information criterion (BIC) values (Supplementary Table 2). When calculating the BIC, the log evidence was penalized using the number of data points associated with each parameter.
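To make the model structure concrete, here is a minimal sketch of a two-option reinforcement learning model with a learning rate, a fixed softmax β and a choice perseveration bonus. The exact equations are in the Online Methods and are not reproduced in this text, so several details below are assumptions: the delta-rule update, initial values of zero, perseveration entering as an additive bonus for the previously chosen option, and the function names themselves.

```python
import math

def prob_choose_first(q, beta, persev, last_choice):
    """Softmax probability of choosing option 0 given values q = [Q0, Q1].

    persev is added to the value of the option chosen on the previous
    trial (assumed form of the perseveration term).
    """
    v = [beta * q[a] + (persev if last_choice == a else 0.0) for a in (0, 1)]
    m = max(v)  # subtract the maximum for numerical stability
    e = [math.exp(x - m) for x in v]
    return e[0] / (e[0] + e[1])

def negative_log_likelihood(choices, rewards, alpha, beta=1.27, persev=0.0):
    """Negative log likelihood of a choice sequence under the model.

    choices: list of 0/1, rewards: list of 0/1, alpha: learning rate.
    beta defaults to the fixed value reported for older adults.
    """
    q = [0.0, 0.0]       # assumed initial expected values
    last_choice = None
    nll = 0.0
    for a, r in zip(choices, rewards):
        p0 = prob_choose_first(q, beta, persev, last_choice)
        nll -= math.log(p0 if a == 0 else 1.0 - p0)
        q[a] += alpha * (r - q[a])  # update by the prediction error R(t) - Q(t)
        last_choice = a
    return nll
```

In practice the learning rate and perseveration parameter would be fitted per participant by minimizing this quantity, or by a hierarchical procedure with a prior over parameters as the text indicates; that fitting step is omitted here.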
To further examine the effects of L-DOPA on older participants’ behavior in the task, we used the Wilcoxon signed rank test to determine whether the learning rates (fitted using a single prior distribution including the drug and the placebo) differed between L-DOPA and placebo. We found that participants had a significantly higher learning rate under L-DOPA than placebo (Z = −3.03, P = 0.002; Fig. 2b). This effect was significant in the group of older adults who performed better under L-DOPA (win more on L-DOPA group, placebo versus L-DOPA: Z = −2.90, P = 0.004), but not among older adults who performed worse on L-DOPA (win less on L-DOPA group, placebo versus L-DOPA: Z = −0.97, P = 0.332), providing a direct link between the effects of L-DOPA and task performance (Fig. 2c). In contrast, choice perseveration was unaffected by L-DOPA (Z = −0.58, P = 0.562). In young adults, a model with a fixed β = 1.13 and single learning rate provided a better fit to participants’ choices than when a choice perseveration parameter was added to the model (BIC = 4,348.15 and 4,361.01, respectively). The learning rate in young adults (median α = 0.62, range 0.01–0.94) was intermediate between, and not significantly different from, the learning rate of older adults with either placebo (Z = −1.32, P = 0.187) or L-DOPA (Z = −1.25, P = 0.211) (Fig. 2b).
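The model comparison above rests on the BIC, where a lower value indicates a better trade-off between fit and complexity. The sketch below is one possible reading of a penalty based on "the number of data points associated with each parameter": each parameter contributes the log of the number of observations it was fitted to. This interpretation, and the function names, are assumptions; when every parameter is fitted to the same n observations the expression reduces to the standard BIC, −2 ln L + k ln n.

```python
import math

def bic(log_likelihood, n_obs_per_param):
    """BIC with a per-parameter penalty.

    n_obs_per_param holds, for each free parameter, the number of data
    points that parameter was fitted to (assumed form of the penalty).
    """
    return -2.0 * log_likelihood + sum(math.log(n) for n in n_obs_per_param)

def preferred_model(bic_without, bic_with):
    """Return which of two models has the lower (better) BIC."""
    return "without" if bic_without < bic_with else "with"

# BIC values reported for young adults: model without versus with
# a choice perseveration parameter.
print(preferred_model(4348.15, 4361.01))
```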
L-DOPA and striatal prediction errors in older adults
We focused our imaging analysis on within-subject comparisons of reward prediction errors in the nucleus accumbens (n = 32 older adults). Using a functional region of interest (ROI) approach
Figure 1 Two-armed bandit task design and
performance in young and older adults. (a) On
each trial, participants selected one of two
fractal images, which were then highlighted in
a red frame. This was followed by an outcome
in which a green upward arrow indicated a win
of £0.10 and a yellow horizontal bar indicated
the absence of a win. If they did not choose
a stimulus, the written message “you did not
choose a picture” was displayed. The same
pair of images was used throughout the task,
although their position on the screen (left or
right) varied. The task consisted of 220 trials
separated into two sessions with a short break in
between. Participants’ earnings were displayed
at the end of the task and given to them at the
end of the test day. The probability of obtaining
a reward associated with each image varied on
a trial-by-trial basis according to a Gaussian
random walk. Two different sets of probability distributions (set A and B) were used on the two testing days, counterbalanced across the order of L-DOPA
or placebo administration. RT, reaction time. (b) Older adults (n = 32) in the placebo condition won less money than young adults (n = 22). When the
same older adults (n = 32) received L-DOPA, performance was similar to young adults. *P < 0.05. Error bars indicate ±1 s.e.m.
[Figure 1 graphic: (a) trial timeline with timing labels (RT, max 2,000 ms; 3,000 ms minus RT; 1,000 ms; 1,500 ± 500 ms) and probability of winning plotted against trial for set A and set B; (b) total won (£) for older placebo, older L-DOPA and young groups.]
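The task structure described in the Figure 1 caption can be sketched as a short simulation: each image's reward probability drifts from trial to trial as a Gaussian random walk, and a win pays £0.10. The 220 trials and the £0.10 payout come from the caption; the starting probability, step size, clipping to [0, 1] and the function names are assumptions made for illustration, since the actual walk parameters are not given in this text.

```python
import random

def random_walk_probabilities(n_trials=220, start=0.5, step_sd=0.05, seed=0):
    """Reward probability of one image on each trial (Gaussian random walk)."""
    rng = random.Random(seed)
    p = start
    probs = []
    for _ in range(n_trials):
        probs.append(p)
        # Gaussian step, clipped so that p stays a valid probability
        # (clipping is an assumed boundary rule).
        p = min(1.0, max(0.0, p + rng.gauss(0.0, step_sd)))
    return probs

def simulate_outcome(prob, rng):
    """Win in pounds for one choice: 0.10 with probability prob, else 0."""
    return 0.10 if rng.random() < prob else 0.0

# One probability trajectory per image; a second set would use other seeds.
probs_image_1 = random_walk_probabilities(seed=1)
probs_image_2 = random_walk_probabilities(seed=2)
```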
(Supplementary Fig. 2), we first defined voxels in the nucleus accumbens that signaled a ‘putative’ prediction error, namely voxels in which there was an enhanced response at the time of outcome to actual rewards that was greater than that to expected rewards (R(t) > Q_{a(t)}(t); see Online Methods). Using this approach, we identified a cluster in the right nucleus accumbens (peak voxel MNI coordinates: x, y, z = 15, 11, −8; peak Z = 4.45, P < 0.001 uncorrected, 34 voxels; Fig. 3a). Note that this is a liberal definition of RPEs, as voxels showing a significant effect with this contrast may not satisfy all of the criteria to be considered for a canonical RPE, namely both a positive effect of reward and a negative effect of expected value19,21. We adopted this approach to test the hypothesis that canonical RPEs are not fully represented in old age and to test for the orthogonal effects of L-DOPA on the separate reward and expected value components of the prediction error signal.
We used this anatomically constrained functional ROI to separately extract the parameter estimates for R(t) and Q_{a(t)}(t) in these activated voxels. Our two (placebo or L-DOPA) by two (R(t) or Q_{a(t)}(t)) repeated-measures ANOVA revealed a main effect of L-DOPA (F_{1,31} = 5.712, P = 0.023), suggesting administration of L-DOPA had an effect on the representations associated with the components of the RPE (Fig. 3a). Notably, blood oxygen level dependent (BOLD) responses were only compatible with a canonical prediction error signal (positive correlation between BOLD and R(t) along with a negative correlation between BOLD and Q_{a(t)}(t)) when participants were under L-DOPA (one-tailed one-sample t test: R(t) L-DOPA, t = 1.92, P = 0.033; Q_{a(t)}(t) L-DOPA, t = −1.73, P = 0.047; R(t) placebo, t = 3.72, P < 0.001; Q_{a(t)}(t) placebo, t = −0.11, P = 0.455). This was a result of a more negative representation of expected value Q_{a(t)}(t) on L-DOPA compared with placebo (paired t test, t_{31} = 2.37, P = 0.024), whereas there was no difference in actual reward representation R(t) between L-DOPA and placebo (t_{31} = 1.38, P = 0.179). These results indicate that canonical RPEs are not fully represented in older adults at baseline, whereby, under placebo, the nucleus accumbens responds to reward and not to expected value. Only after receipt of L-DOPA was a canonical RPE signal observed.
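The logic of testing the two RPE components separately can be illustrated with a toy regression. If a signal truly encodes R(t) − Q_{a(t)}(t), regressing it on reward and expected value as separate predictors should give a positive reward coefficient and a negative expected value coefficient; a signal that only tracks reward gives a positive reward coefficient and an expected value coefficient near zero. The sketch below uses plain ordinary least squares on made-up trial values; it ignores haemodynamic convolution, noise and the actual fMRI design, and the function name and data are illustrative assumptions.

```python
def fit_reward_and_value(signal, reward, value):
    """Ordinary least squares of signal on reward and value (with intercept).

    Returns (beta_reward, beta_value) from the two-predictor normal
    equations on mean-centered data.
    """
    n = len(signal)
    mean_r, mean_q, mean_y = sum(reward) / n, sum(value) / n, sum(signal) / n
    r = [x - mean_r for x in reward]
    q = [x - mean_q for x in value]
    y = [x - mean_y for x in signal]
    s_rr = sum(a * a for a in r)
    s_qq = sum(a * a for a in q)
    s_rq = sum(a * b for a, b in zip(r, q))
    s_ry = sum(a * b for a, b in zip(r, y))
    s_qy = sum(a * b for a, b in zip(q, y))
    det = s_rr * s_qq - s_rq ** 2
    beta_reward = (s_qq * s_ry - s_rq * s_qy) / det
    beta_value = (s_rr * s_qy - s_rq * s_ry) / det
    return beta_reward, beta_value

# Hypothetical trial-wise reward outcomes and expected values.
reward = [1, 0, 1, 1, 0, 0]
value = [0.5, 0.6, 0.4, 0.7, 0.3, 0.5]

canonical = [r - q for r, q in zip(reward, value)]  # full prediction error
reward_only = [float(r) for r in reward]            # tracks reward alone

print(fit_reward_and_value(canonical, reward, value))
print(fit_reward_and_value(reward_only, reward, value))
```

A contrast such as R(t) > Q_{a(t)}(t) would be satisfied by both toy signals, which is why the text describes it as a liberal definition and then inspects the two coefficients individually.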
Under placebo, individual differences in the total amount won on the task correlated positively with the learning rate (Spearman’s ρ = 0.39, P = 0.027) and task performance correlated negatively with the BOLD representation of expected value (Q_{a(t)}(t), Pearson’s r = −0.42, P = 0.016), although this was not the case with reward (R(t), Pearson’s r = −0.07, P = 0.707). Thus, better baseline performance was associated with a higher learning rate and more negative expected value representations in the nucleus accumbens. Across all 32 older participants, task performance on L-DOPA did not correlate with the learning rate or BOLD representations of reward or expected value (all P > 0.15; Supplementary Table 3).
However, subsequent analysis on the basis of a median split for the effects of drug on performance revealed that expected value (Q_{a(t)}(t))
[Figure 2 graphic: (a) probability of choosing stimulus plotted against trial number for older and young adults, sets A and B, showing subject choices and simulated choices; (b) learning rate for older placebo, older L-DOPA and young groups; (c) learning rate under placebo and L-DOPA for older adults in the win less on L-DOPA and win more on L-DOPA groups.]
Figure 2 Reinforcement learning model and behavior. (a) For young and older adults, the predicted
choices from the learning model (red) closely matched subjects’ observed choices (blue). The red lines
show the same time-varying probabilities, but evaluated on choices sampled from the model (Online
Methods). Plots are shown for the two different sets of probability distributions used on the two test days.
(b) Older adults (n = 32) had a higher learning rate under L-DOPA compared with placebo and did not
differ from young adults (n = 22). *P < 0.05, two-tailed. Error bars represent ±1 s.e.m. (c) Older adults
who won more on L-DOPA than placebo (n = 15) had a significantly higher learning rate under L-DOPA
than placebo, whereas learning rates did not differ between placebo and L-DOPA for older adults who
won less on L-DOPA than placebo (n = 17). *P < 0.05, two-tailed t test. Error bars represent ±1 s.e.m.
Figure 3 Reward prediction in the nucleus accumbens in 32 older
adults. (a) A region in the right nucleus accumbens showed greater BOLD
activity for reward (R) than for expected value (Q) at the time of outcome
(putative RPE). However, the lack of a negative effect of expected value
under placebo meant this prediction error signal was incomplete
(*P < 0.05, one-sample t test, one-tailed). L-DOPA increased the negative
effect of expected value (paired t test, **P < 0.05, two tailed), resulting
in a canonical prediction error signal (both a positive effect of reward
and negative effect of expected value). Error bars represent ±1 s.e.m.
(b) Participants who won more on L-DOPA (n = 15) demonstrated a
negative effect of expected value under L-DOPA and not under placebo.
Reward and expected value parameter estimates did not differ between
L-DOPA and placebo for participants who won less on L-DOPA (n = 17).
**P < 0.05, paired t test. Error bars represent ±1 s.e.m. (c) Time
course plots of the nucleus accumbens BOLD response to reward and
expected value. White box corresponds with BOLD responses elicited at
the time participants made a choice; gray box corresponds with BOLD
responses elicited when the outcomes were revealed. Under placebo,
the only reliable signal observed was a reward response. Under L-DOPA,
a canonical RPE was observed, involving a positive expectation of value
at the time of the choice together with a positive reward response and a
negative expectation of value at the time of the outcome. Reward anticipation (positive effect at the time of the choice) was only observed on L-DOPA.
Solid lines are group means of the effect sizes and shaded areas represent ±1 s.e.m.
[Figure 3 graphic: (a) parameter estimates for R and Q under placebo and L-DOPA; (b) parameter estimates for R and Q under placebo and L-DOPA in the win less on L-DOPA and win more on L-DOPA groups; (c) time course of effect size (a.u.) for reward and expected value from stimulus to outcome, 0 to 15 s, under placebo and under L-DOPA.]
parameter estimates in older adults who performed better on L-DOPA were significantly more negative on L-DOPA than placebo (win more on L-DOPA group, Q_{a(t)}(t), placebo versus L-DOPA, t_{14} = 2.26, P = 0.040; Fig. 3b). In contrast, L-DOPA did not affect expected value representation in the win less on L-DOPA group (t_{16} = 1.18, P = 0.257) or reward representation in either the win less on L-DOPA (t_{16} = 1.56, P = 0.137) or win more on L-DOPA groups (t_{14} = 0.48, P = 0.637). These results indicate that the restoration of a canonical prediction error signal, mediated by a more negative representation of expected value under L-DOPA, is associated with better task performance.
Although L-DOPA did not affect reward or expected value parameter estimates of those older participants in the win less on L-DOPA group, these participants continued to show a negative BOLD correlate of Q_{a(t)}(t) under L-DOPA even though their performance was worse on L-DOPA (Fig. 3b). One possibility was a differential effect of L-DOPA on the noise in the representations of R(t) and Q_{a(t)}(t) for these participants. To address this, we measured individuals’ standard error of the parameter estimates on L-DOPA and placebo. We found a significant negative correlation between the drug-induced change in total won on the task and the drug-induced change in the standard error of R(t) (Spearman’s ρ = −0.62, P = 0.009) and Q_{a(t)}(t) (Spearman’s ρ = −0.61, P = 0.009) only in the win less on L-DOPA group (Supplementary Fig. 3). This suggests that, for participants with high baseline levels of performance, L-DOPA increased noise in their reward and expected value representations and this was associated with a worsening in performance. Notably, the increase in noise in the BOLD responses was not related to worse fits of the reinforcement learning models, as mean model likelihood did not differ between groups and did not correlate with standard error of R(t) and Q_{a(t)}(t) (Supplementary Table 4).
To visualize the effects of L-DOPA on reward prediction over the course of a trial, we extracted the BOLD time course from the nucleus accumbens functional ROI and performed a regression of this fMRI signal against R(t) and Q_{a(t)}(t). Typically, we would expect to see a pattern of a reward prediction (that is, anticipation) at the time of the choice indicated by a positive effect of Q_{a(t)}(t) and an RPE at the time of the outcome, indicated by both a positive effect of R(t) and negative effect of Q_{a(t)}(t). Our time course analysis revealed exactly this expected pattern, but only in the L-DOPA condition (Fig. 3c). Thus, the abnormal response to the expected value observed among older adults on placebo (lack of reward anticipation at the time of the choice and absent negative expectation at the time of the outcome) was restored under L-DOPA. This analysis complements the aforementioned fMRI analysis, which showed that a canonical RPE was only present on L-DOPA, by revealing abnormal expected value representations throughout the course of a trial under placebo.
In addition, we performed further multiple regression analyses across
all older adults to identify regions in the brain in which reward, expected
value and putative RPEs correlated with task performance (total money
won) separately for L-DOPA and placebo conditions. Of note, only a
model examining negative correlations between expected value and
performance identified regions that survived family-wise error whole-
brain correction (Supplementary Fig. 4). We found a left superior
parietal cluster in the placebo condition (Z = 5.06, peak voxel MNI
coordinates: −26, −78, 50), and left inferior parietal (Z = 5.34, peak
voxel MNI coordinates: −48, −49, 48) and right precuneus clusters
(Z = 5.02, peak voxel MNI coordinates: 12, −72, 59) in the L-DOPA
condition. This suggests that extra-striatal regions also influenced task
performance, whereby individuals with a more negative representation
of expected value in parietal regions won more money on the task.
Anatomical connectivity and RPEs
Our analysis identified substantial inter-individual variability among
older adults for both reward and expected value representations in the
nucleus accumbens at baseline (that is, under placebo; Supplementary
Fig. 5), whereby the latter was associated with task performance. We
hypothesized that this might be associated with the known variabil-
ity in the age-related decline of dopamine neurons from the SN/VTA,
which may, in principle, be indexed through anatomical nigro-striatal
connectivity. Using DTI and probabilistic tractography (n = 30 older
adults), we defined a measure of connection strength between the
right SN/VTA and right striatum (Supplementary Fig. 6 and Online
Methods). Nigro-striatal tract connectivity strength measured with DTI
correlated with the fMRI parameter estimate under placebo associated
with expected value (Q_a(t)(t)) (Spearman's ρ = −0.46, P = 0.010), but not
with that associated with reward (R(t)) (Spearman's ρ = 0.12, P = 0.54)
(Fig. 4). These correlations were significantly different from
each other, suggesting that individual functional activation differences
of the representation of expected value, but not reward, were linked to
anatomical connectivity strength between the SN/VTA and striatum
(Fisher's r-to-z transformation, z = −2.32, P = 0.002). This relationship
between greater tract connectivity strength and more negative expected
value parameter estimates remained significant after controlling for age,
gender, total intracranial volume, size of the seed region from which
tractography was performed and global white matter integrity indexed
by fractional anisotropy (partial Spearman's ρ = −0.44, P = 0.027). There
was no difference in this correlation between subgroups of older adults
(win more on L-DOPA group, n = 14, ρ = −0.54, P = 0.047; win less on
L-DOPA group, n = 16, ρ = −0.37, P = 0.154; Fisher's r-to-z
transformation comparing both groups, z = 0.53, P = 0.596; Supplementary
Fig. 7). Neither fractional anisotropy values of SN/VTA nor nucleus
accumbens functional ROI correlated with expected value (Pearson's
r = 0.26 and r = 0.17, P = 0.16 and P = 0.38, respectively), suggesting
that this correlation was related to circuit strength rather than to local
structural integrity as determined by fractional anisotropy.
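As an illustration of the subgroup comparison above, Fisher's r-to-z test for two correlations from independent groups can be sketched in a few lines. This is a minimal sketch, not the authors' analysis code: the function name is hypothetical, the rounded coefficients from the text are used as inputs, and applying the transformation to Spearman's ρ is itself an approximation.

```python
import math


def fisher_r_to_z_test(r1, n1, r2, n2):
    """Compare two correlations from independent groups (hypothetical helper).

    Returns the z statistic and a two-tailed P value.
    """
    # Fisher transformation: atanh(r) is approximately normal
    # with variance 1 / (n - 3).
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-tailed P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Rounded subgroup values taken from the text above.
z_stat, p_value = fisher_r_to_z_test(-0.54, 14, -0.37, 16)
print(f"z = {z_stat:.2f}, P = {p_value:.3f}")
```

With the rounded values reported above, this gives |z| of about 0.53 and a two-tailed P of about 0.60, in line with the reported comparison. The sign of z depends only on the order in which the groups are entered.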
Older participants with equivalent baseline performance levels to
young adults (win less on L-DOPA group on placebo) had stronger
connectivity between SN/VTA and the striatum than older partici-
pants with worse baseline performance than young adults (win more
on L-DOPA group on placebo; between-groups comparison, t(29) = 2.40,
P = 0.023). This suggests that these older individuals had higher base-
line integrity of the nigro-striatal dopamine circuit than older adults
with lower baseline levels of performance.
Figure 4 Nigro-striatal tract connectivity strength and functional
prediction errors. Under placebo, individuals with higher white matter
nigro-striatal tract connectivity strength (determined using DTI) had a
more negative effect of expected value, whereas there was no correlation
with functional parameter estimates of reward. Each dot on the plots
represents one subject (n = 30, note two participants are overlapping on
the plot on the left), the solid line is the regression slope, and the dashed
lines represent 95% confidence intervals. Panels: connectivity strength
plotted against reward parameter estimates (Spearman's ρ = 0.12,
P = 0.54) and against expected value parameter estimates (Spearman's
ρ = −0.46, P = 0.01).
© 2013 Nature America, Inc. All rights reserved.
652 VOLUME 16 | NUMBER 5 | MAY 2013 NATURE NEUROSCIENCE
DISCUSSION
We used a probabilistic reinforcement learning task in combination
with a pharmacological manipulation of dopamine, as well as structural
and functional imaging, to probe reward-based decision-making
in old age. Overall, older adults had an incomplete RPE signal in
the nucleus accumbens consequent on a lack of a neuronal response
to expected reward value. Baseline inter-individual differences of
the expression of expected value were linked to performance and
tightly coupled to nigro-striatal structural connectivity strength,
determined using DTI. L-DOPA increased the task-based learning
rate and modified the BOLD representation of expected value in the
nucleus accumbens. Notably, this effect was only observed for those
participants that showed a substantial drug-induced improvement
on task performance.
Previous studies have shown that older adults perform worse on
probabilistic learning tasks than their younger counterparts2,22,23. As
it is widely held that dopamine neurons encode an RPE signal, it is
conceivable that a dopamine decline, occurring as part of the normal
aging process, could account for these behavioral deficits. Indeed, this
was a prime motivation for our use of L-DOPA. Although there was
no significant difference in task performance in older adults as a group
on placebo versus L-DOPA, we found that older adults with low base-
line levels of performance improved following L-DOPA treatment.
Using a reinforcement-learning model, we found that those older
adults who performed better under L-DOPA had a higher learning
rate on L-DOPA than on placebo. This is consistent with findings
in patients with Parkinson's disease (a dopamine deficit disorder)
whose learning rates when on dopaminergic medication are higher
than when off their medication, albeit, in this instance, without any
significant difference in overall performance11. As in that study11, it
is impossible to make a definitive distinction between learning rate,
the magnitude of the prediction error that arises from learning and
the stochastic way that learning leads to choice.
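The difficulty of separating prediction error magnitude from choice stochasticity can be illustrated with a toy model. The sketch below assumes a two-option Q-learning rule with a softmax choice function and values initialized at zero; the function names, the trial sequence and the parameter values are hypothetical and are not taken from the study. Doubling the effective reward magnitude while halving the softmax parameter β leaves every choice probability unchanged, so behavior alone cannot tell the two accounts apart.

```python
import math


def softmax_prob_first(q_values, beta):
    """Probability of choosing option 0 under a softmax with parameter beta."""
    scaled = [beta * q for q in q_values]
    largest = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - largest) for s in scaled]
    return weights[0] / sum(weights)


def chosen_option_probs(choices, rewards, alpha, beta, reward_scale=1.0):
    """Trial-by-trial probability the model assigns to the option chosen."""
    q = [0.0, 0.0]  # assumed starting values for the two options
    probs = []
    for action, reward in zip(choices, rewards):
        p0 = softmax_prob_first(q, beta)
        probs.append(p0 if action == 0 else 1.0 - p0)
        # Prediction error on the (scaled) reward, then value update.
        delta = reward_scale * reward - q[action]
        q[action] += alpha * delta
    return probs


# Hypothetical choice and reward sequence.
choices = [0, 1, 0, 0, 1, 0]
rewards = [1, 0, 1, 0, 1, 1]

base = chosen_option_probs(choices, rewards, alpha=0.3, beta=2.0)
scaled = chosen_option_probs(choices, rewards, alpha=0.3, beta=1.0,
                             reward_scale=2.0)
print(max(abs(b - s) for b, s in zip(base, scaled)))
```

In this toy setting, the largest difference between the two sets of choice probabilities is zero up to floating-point precision.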
There are two important points in each trial at which a tempo-
ral difference error type signal can be anticipated, namely at choice,
when the temporal difference error is the expected value of the chosen
option, and at the time of outcome, when the temporal difference
error is the difference between the reward actually provided and the
expected value. Decomposing the outcome signal into these separate
positive and negative components is important because the response
to reward is highly correlated with the full prediction error, potentially
readily confusing the two19,21,24. Overall, in our experiment, under
placebo, although the representation of the actual reward appeared to
be normal, neither of the components of the expected value signal at
choice or outcome was present in nucleus accumbens BOLD signal.
This absence is consistent with the few behavioral23 and neuroimaging
studies16,17 that have suggested that older adults, on average, have
abnormal expected value representations, although it is important
to note that we did not find a substantial behavioral impairment.
Notably, we found that, under L-DOPA, both components of the
expected value signal were restored. However, a closer inspection,
taking individual differences in drug-induced effects on performance
into account, revealed that this was only the case for those older adults
whose performance improved under L-DOPA.
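The decomposition of the outcome signal described above can be made concrete with a short sketch. It assumes a simple value update, Q_a(t+1) = Q_a(t) + α(R(t) − Q_a(t)(t)), with values starting at zero; the function name, trial sequence and learning rate are illustrative assumptions rather than the fitted model. The point is that reward and expected value are tracked as separate quantities, so that each can enter an analysis as its own regressor.

```python
def outcome_components(choices, rewards, alpha, n_options=2, q_init=0.0):
    """Split each outcome prediction error into reward and expected value.

    Hypothetical helper: returns one dict per trial with the expected value
    of the chosen option, the reward received and their difference.
    """
    q = [q_init] * n_options
    trials = []
    for action, reward in zip(choices, rewards):
        expected = q[action]        # expected value of the chosen option
        rpe = reward - expected     # outcome prediction error
        trials.append({"expected_value": expected,
                       "reward": reward,
                       "rpe": rpe})
        q[action] = expected + alpha * rpe  # update the chosen option only
    return trials


# Hypothetical sequence: option 0 is rewarded twice, then not rewarded.
trials = outcome_components([0, 0, 1, 0], [1, 1, 0, 0], alpha=0.5)
for trial in trials:
    print(trial)
```

A canonical RPE would then correspond to a positive effect of the reward column together with a negative effect of the expected value column.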
There are at least two possible explanations for the absence of the
expected value signal. One is that a putative model-free decision-making
system, closely associated with neuromodulatory effects3,25, is impaired.
This would render reward-based behavior subject to the operation of a
model-based system, which is thought to be less dependent on
dopaminergic transmission26. This possibility is supported by evidence
that older adults perform better than younger adults in tasks requiring
a model of the environment (for example, where future outcomes are
dependent on previous choices)27. Reconciling it with the observation
that suppressing28 or boosting29 dopamine in healthy young volunteers
suppresses or boosts, respectively, model-based over model-free control
is more of a challenge. In relation to this point, we identified two
parietal clusters where expected value representations correlated with
task performance in the L-DOPA condition. Notably, these clusters
overlap with regions purported to signal state prediction errors30. One
possibility is that these regions may be a neural signature of
model-based calculations, which have also recently been shown to be
enhanced by L-DOPA in young participants29. Although previous studies
have shown dopaminergic modulation of value representations in the
prefrontal cortex31, we did not find strong evidence for the involvement
of any other extra-striatal regions implicated in the effects of L-DOPA
on reward processing in our sample of older adults. However, L-DOPA may
have also influenced other extra-striatal learning mechanisms in our
task. For example, episodic learning mediated by the hippocampus has
also been linked to the dopaminergic system32 and could support aspects
of rapid learning when it occurs.
Another possibility for the absence of a model-free expected value
signal is that it is still calculated normally, but that when dopamine
levels are low, it is not manifest in nucleus accumbens BOLD signal.
One can reasonably expect that dopamine levels will affect the state of
striatal neurons33. However, dopamine effects on local activity in the
striatum as well as on the BOLD signal of cortical and dopaminergic
inputs to the striatum remain unclear. In the future, it would be
interesting to use procedures based on recent studies (for example, see
refs. 34,35) in older participants with and without L-DOPA to investigate
the balance of model-free and model-based control.
Enriching the above picture are recent studies in healthy young
participants showing that at least some aspects of the representation
in striatal BOLD of the expected value component of the temporal dif-
ference error are conditional on a requirement for action. In one such
study, the representation of expected value in young adults was not
modulated by L-DOPA36. However, it is not clear whether this is an
effect of the more extensive training provided in that study (which can
render behaviors insensitive to dopamine manipulations37) or that the
expected value did not fluctuate in a way that was relevant for choice.
Together with recent findings38, these results raise the possibility that
dopamine might only modulate the neural representation of expected
value when it is behaviorally relevant for the task at hand.
Our DTI connectivity analysis supports the notion that neuronal
representations of expected value, and thus appropriate RPE signal-
ing, rely on the integrity of the dopaminergic system. The connectivity
strength of tracts is one DTI metric that has been reported to predict
age-related performance differences39,40. Notably, older adults whose
task performance under placebo matched that of young adults had
higher nigro-striatal connectivity strength than older adults with
lower baseline levels of task performance. Furthermore, older
individuals with stronger connectivity between SN/VTA and striatum
had more robust value representations in the nucleus accumbens.
Although our findings can be interpreted in the context of a well-defined
decline of nigro-striatal dopamine neurons with increasing age12,13, we
acknowledge that DTI measures of connectivity are not a direct mapping
of dopamine neurons, but instead reflect white matter tract strength
between the SN/VTA and striatum. In addition, the direction of
information flow cannot be inferred from DTI-based tractography41. We
did not observe a relationship between fractional
anisotropy of either the SN/VTA or striatum with functional activity in
the accumbens. Fractional anisotropy values characterize the extent of
water diffusion, thereby providing an indirect measure of myelin, axons
and the structural organization of both gray and white matter15,42. Our
results are therefore an indication that inter-individual anatomical

Citations
Journal ArticleDOI
22 Oct 2014-Neuron
TL;DR: Findings suggest a link between the mechanisms supporting extrinsic reward motivation and intrinsic curiosity and highlight the importance of stimulating curiosity to create more effective learning experiences.

420 citations


Cites background from "Dopamine restores reward prediction..."

  • ...Implications The present findings have potential implications for understanding memory deficits in the elderly and in patients with psychiatric and neurological disorders that affect dopaminergic transmission (Chowdhury et al., 2013; Düzel et al., 2010; Goto and Grace, 2008; Lisman et al., 2011)....

    [...]


Journal ArticleDOI
19 Jun 2013
TL;DR: Reward-related learning reflected at least two partially separable contributions related to phasic prediction error signalling, and was preferentially modulated by a low dose of the dopamine agonist pramipexole in MDD and anhedonia.
Abstract: Background: Depression is characterised partly by blunted reactions to reward. However, tasks probing this deficiency have not distinguished insensitivity to reward from insensitivity to the prediction errors for reward that determine learning and are putatively reported by the phasic activity of dopamine neurons. We attempted to disentangle these factors with respect to anhedonia in the context of stress, Major Depressive Disorder (MDD), Bipolar Disorder (BPD) and a dopaminergic challenge. Methods: Six behavioural datasets involving 392 experimental sessions were subjected to a model-based, Bayesian meta-analysis. Participants across all six studies performed a probabilistic reward task that used an asymmetric reinforcement schedule to assess reward learning. Healthy controls were tested under baseline conditions, stress or after receiving the dopamine D2 agonist pramipexole. In addition, participants with current or past MDD or BPD were evaluated. Reinforcement learning models isolated the contributions of variation in reward sensitivity and learning rate. Results: MDD and anhedonia reduced reward sensitivity more than they affected the learning rate, while a low dose of the dopamine D2 agonist pramipexole showed the opposite pattern. Stress led to a pattern consistent with a mixed effect on reward sensitivity and learning rate. Conclusion: Reward-related learning reflected at least two partially separable contributions. The first related to phasic prediction error signalling, and was preferentially modulated by a low dose of the dopamine agonist pramipexole. The second related directly to reward sensitivity, and was preferentially reduced in MDD and anhedonia. Stress altered both components. Collectively, these findings highlight the contribution of model-based reinforcement learning meta-analysis for dissecting anhedonic behavior.

354 citations


Cites result from "Dopamine restores reward prediction..."

  • ...Thus, our findings echo those of a recent study showing that dopamine agonists can reinstate prediction errors by restoring the component of the prediction error related to expected value, rather than that of reward [84]....

    [...]

Journal ArticleDOI
TL;DR: Examination of the neural mechanisms underlying apathy and anhedonia within a transdiagnostic framework of effort-based decision making for reward suggests that there might be some shared mechanisms between both syndromes.
Abstract: Apathy and anhedonia are common syndromes of motivation that are associated with a wide range of brain disorders and have no established therapies. Research using animal models suggests that a useful framework for understanding motivated behaviour lies in effort-based decision making for reward. The neurobiological mechanisms underpinning such decisions have now begun to be determined in individuals with apathy or anhedonia, providing an important foundation for developing new treatments. The findings suggest that there might be some shared mechanisms between both syndromes. A transdiagnostic approach that cuts across traditional disease boundaries provides a potentially useful means for understanding these conditions.

330 citations

Journal ArticleDOI
TL;DR: This review highlights some of the key challenges in analyzing and interpreting in vivo connectomics data, particularly in relation to what is known from classical neuroanatomy in laboratory animals and illustrates that, despite the challenges, in vivo imaging methods can be very powerful and provide information on connections that is not available by any other means.
Abstract: Decades of detailed anatomical tracer studies in non-human animals point to a rich and complex organization of long-range white matter connections in the brain. State-of-the art in vivo imaging techniques are striving to achieve a similar level of detail in humans, but multiple technical factors can limit their sensitivity and fidelity. In this review, we mostly focus on magnetic resonance imaging of the brain. We highlight some of the key challenges in analyzing and interpreting in vivo connectomics data, particularly in relation to what is known from classical neuroanatomy in laboratory animals. We further illustrate that, despite the challenges, in vivo imaging methods can be very powerful and provide information on connections that is not available by any other means.

300 citations

Journal ArticleDOI
31 Oct 2014-Science
TL;DR: Interventions help to identify contexts and mechanisms of successful cognitive aging and give science and society a hint about what would be possible if conditions were different.
Abstract: Human cognitive aging differs between and is malleable within individuals. In the absence of a strong genetic program, it is open to a host of hazards, such as vascular conditions, metabolic syndrome, and chronic stress, but also open to protective and enhancing factors, such as experience-dependent cognitive plasticity. Longitudinal studies suggest that leading an intellectually challenging, physically active, and socially engaged life may mitigate losses and consolidate gains. Interventions help to identify contexts and mechanisms of successful cognitive aging and give science and society a hint about what would be possible if conditions were different.

288 citations

References
Journal ArticleDOI
14 Mar 1997-Science
TL;DR: Findings in this work indicate that dopaminergic neurons in the primate whose fluctuating output apparently signals changes or errors in the predictions of future salient and rewarding events can be understood through quantitative theories of adaptive optimizing control.
Abstract: The capacity to predict future events permits a creature to detect, model, and manipulate the causal structure of its interactions with its environment. Behavioral experiments suggest that learning is driven by changes in the expectations about future salient events such as rewards and punishments. Physiological work has recently complemented these studies by identifying dopaminergic neurons in the primate whose fluctuating output apparently signals changes or errors in the predictions of future salient and rewarding events. Taken together, these findings can be understood through quantitative theories of adaptive optimizing control.

8,163 citations

Journal ArticleDOI
TL;DR: DARTEL has been applied to intersubject registration of 471 whole brain images, and the resulting deformations were evaluated in terms of how well they encode the shape information necessary to separate male and female subjects and to predict the ages of the subjects.

6,999 citations

Journal ArticleDOI
01 Oct 1991-Brain
TL;DR: It is suggested that age-related attrition of pigmented nigral cells is not an important factor in the pathogenesis of Parkinson's disease and the regional selectivity of PD is relatively specific.
Abstract: The micro-architecture of the substantia nigra was studied in control cases of varying age and patients with parkinsonism. A single 7 mu section stained with haematoxylin and eosin was examined at a specific level within the caudal nigra using strict criteria. The pars compacta was divided into a ventral and a dorsal tier, and each tier was further subdivided into 3 regions. In 36 control cases there was a linear fallout of pigmented neurons with advancing age in the pars compacta of the caudal substantia nigra at a rate of 4.7% per decade. Regionally, the lateral ventral tier was relatively spared (2.1% loss per decade) compared with the medial ventral tier (5.4%) and the dorsal tier (6.9%). In 20 Parkinson's disease (PD) cases of varying disease duration there was an exponential loss of pigmented neurons with a 45% loss in the first decade. Regionally, the pattern was opposite to ageing. Loss was greatest in the lateral ventral tier (average loss 91%) followed by the medial ventral tier (71%) and the dorsal tier (56%). The presymptomatic phase of PD from the onset of neuronal loss was estimated to be about 5 yrs. This phase is represented by incidental Lewy body cases: individuals who die without clinical signs of PD or dementia, but who are found to have Lewy bodies at post-mortem. In 7 cases cell loss was confined to the lateral ventral tier (average loss 52%) congruent with the lateral ventral selectivity of symptomatic PD. It was calculated that at the onset of symptoms there was a 68% cell loss in the lateral ventral tier and a 48% loss in the caudal nigra as a whole. The regional selectivity of PD is relatively specific. In 15 cases of striatonigral degeneration the distribution of cell loss was similar, but the loss in the dorsal tier was greater than PD by 21%. In 14 cases of Steele-Richardson-Olszewski syndrome (SRO) there was no predilection for the lateral ventral tier, but a tendency to involve the medial nigra and spare the lateral. 
These findings suggest that age-related attrition of pigmented nigral cells is not an important factor in the pathogenesis of PD.

3,181 citations

Journal Article
TL;DR: A toolbox called MarsBar is implemented for region of interest analysis within the SPM99 software package, which may have many advantages in terms of statistical power and the ease of interpretation of neuroimaging data.

2,987 citations

Journal ArticleDOI
TL;DR: This work suggests acquiring two images for each diffusion gradient; one with bottom-up and one with top-down traversal of k-space in the phase-encode direction, which achieves the simultaneous goals of providing information on the underlying displacement field and intensity maps with adequate spatial sampling density even in distorted areas.

2,409 citations

Frequently Asked Questions (12)
Q1. What are the contributions in "Dopamine restores reward prediction errors in old age" ?

In humans, there is compelling evidence that functional activation patterns in the nucleus accumbens, a major target region of dopamine neurons5, report rewarding outcomes and associated prediction errors6–9. The authors studied the effect of probabilistic rewarding outcomes on the separate reward and prediction components of a prediction error signal19 in healthy older adults. Older adults underwent DTI and functional magnetic resonance imaging (fMRI) in combination with a pharmacological manipulation using the dopamine precursor L-DOPA in a within-subject, double-blind, placebo-controlled study. The authors collected behavioral data in a group of young adults to contextualize the effects of age on performance. By exploiting a reinforcement learning model, the authors were able to determine which component of the prediction error (the actual and/or expected reward representation) was impaired in older age. Thus, the authors predicted that L-DOPA would increase the learning rate evident in behavior as well as boost the representation of an RPE in the nucleus accumbens of healthy older adults, specifically by increasing the component associated with the expected value. This has led to the suggestion that, although older adults may maintain adequate representations of reward, they are unable to learn correctly from these representations.

This possibility is supported by evidence that older adults perform better than younger adults in tasks requiring a model of the environment (for example, where future outcomes are dependent on previous choices)27. In the future, it would be interesting to use procedures based on recent studies (for example, see refs. In summary, their findings suggest that a subgroup of older adults who underperform at baseline can show a drug-induced improvement in task performance. On the other hand, participants that performed better on the task under placebo (that is, on a par with young controls) had a greater representation of expected value in the striatum and stronger nigro-striatal connectivity, suggesting higher baseline dopamine status.

The general linear model for each subject at the first level consisted of regressors at the time of stimulus display separately for when a choice was made, when no choice was made and at the time of stimulus outcome. 

For these older adults, L-DOPA increased a task-based learning rate and led to a canonical RPE signal by restoring the representation of expected value in the nucleus accumbens. 

The first seven reference images were acquired with a b value of 100 s mm−2 (low b images) and the remaining 61 images with a b value of 1,000 s mm−2 (ref. 48). 

Another possibility for the absence of a model-free expected value signal is that it is still calculated normally, but that when dopamine levels are low, it is not manifest in nucleus accumbens BOLD signal. 

37. Choi, W.Y., Balsam, P.D. & Horvitz, J.C. Extended habit training reduces dopamine mediation of appetitive response expression. 

This suggests that, for participants with high baseline levels of performance, L-DOPA increased noise in their reward and expected value representations and this was associated with a worsening in performance. 

The probabilities of obtaining a reward for each stimulus were independent of each other and varied on a trial-to-trial basis according to a Gaussian random walk, generated using a previously described procedure8. 
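
A reward schedule of this kind can be sketched as follows. The starting probability, step size, clipping to [0, 1] and seeds are assumptions chosen for illustration; the previously described procedure cited above may use different values and a different way of handling the bounds.

```python
import random


def gaussian_random_walk(n_trials, start=0.5, step_sd=0.05,
                         lower=0.0, upper=1.0, seed=0):
    """Reward probability per trial, drifting as a bounded Gaussian random walk.

    All default values are illustrative assumptions, not the study's settings.
    """
    rng = random.Random(seed)  # seeded generator for reproducibility
    probability = start
    path = []
    for _ in range(n_trials):
        path.append(probability)
        # Add Gaussian noise, then clip so the value stays a valid probability.
        probability += rng.gauss(0.0, step_sd)
        probability = min(upper, max(lower, probability))
    return path


# Independent walks for two stimuli, using different seeds.
walk_a = gaussian_random_walk(100, seed=1)
walk_b = gaussian_random_walk(100, seed=2)
print(walk_a[:5])
print(walk_b[:5])
```

Using a separate seeded generator per stimulus keeps the two schedules independent of each other while leaving each one reproducible.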

(Supplementary Fig. 2), the authors first defined voxels in the nucleus accumbens that signaled a ‘putative’ prediction error, namely voxels in which there was an enhanced response at the time of outcome to actual rewards that was greater than that to expected rewards (R(t) > Q_a(t)(t); see Online Methods). 

Note that this is a liberal definition of RPEs, as voxels showing a significant effect with this contrast may not satisfy all of the criteria to be considered for a canonical RPE, namely both a positive effect of reward and a negative effect of expected value19,21. 

Although the SPM model included regressors at the time of the choice and time of the outcome, the authors only included parametric modulators at the time of the outcome, thereby focusing on just outcome prediction errors.