Dopamine restores reward prediction errors in old age
Summary (2 min read)
Introduction
- Older adults are particularly poor at making decisions when faced with probabilistic rewards, possibly because of impaired learning of stimulus-outcome contingencies1,2.
- Thus, the authors predicted that L-DOPA would increase the learning rate evident in behavior as well as boost the representation of an RPE in the nucleus accumbens of healthy older adults, specifically by increasing the component associated with the expected value.
Reinforcement learning behavior
- The authors analyzed trial-by-trial choice behavior using a standard reinforcement learning model with a fixed β parameter (Fig. 2a).
- (a) On each trial, participants selected one of two fractal images, which were then highlighted in a red frame.
- Error bars represent ±1 s.e.m. (c) Older adults who won more on L-DOPA than placebo (n = 15) had a significantly higher learning rate under L-DOPA than placebo, whereas learning rates did not differ between placebo and L-DOPA for older adults who won less on L-DOPA than placebo (n = 17).
- Error bars represent ±1 s.e.m. (a) A region in the right nucleus accumbens showed greater BOLD activity for reward (R) than for expected value (Q) at the time of outcome (putative RPE).
- L-DOPA increased the negative effect of expected value (paired t test, **P < 0.05, two tailed), resulting in a canonical prediction error signal (both a positive effect of reward and negative effect of expected value).
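The reinforcement learning model described above can be sketched as a Q-learning rule with a softmax choice function. This is a minimal illustration, not the authors' code: the function names, the default β of 1.0, the initial values of 0 and the grid search over the learning rate are all assumptions made for the example.

```python
import math


def softmax_choice_prob(q_values, chosen, beta):
    """Probability of choosing option `chosen` under a softmax rule.

    `beta` is the (fixed) inverse-temperature parameter.
    """
    weights = [math.exp(beta * q) for q in q_values]
    return weights[chosen] / sum(weights)


def negative_log_likelihood(choices, rewards, alpha, beta=1.0, q_init=0.0):
    """Negative log-likelihood of a choice sequence for a two-option task.

    alpha  : learning rate (the free parameter)
    beta   : softmax inverse temperature, held fixed (value assumed here)
    q_init : starting expected value for both options (assumed here)
    """
    q = [q_init, q_init]
    nll = 0.0
    for action, reward in zip(choices, rewards):
        nll -= math.log(softmax_choice_prob(q, action, beta))
        rpe = reward - q[action]   # reward prediction error R(t) - Q(t)
        q[action] += alpha * rpe   # update only the chosen option
    return nll


def fit_learning_rate(choices, rewards, beta=1.0, grid_steps=100):
    """Grid search for the learning rate in (0, 1] that best explains the choices."""
    grid = [i / grid_steps for i in range(1, grid_steps + 1)]
    return min(grid, key=lambda a: negative_log_likelihood(choices, rewards, a, beta))
```

Fitting this separately to a participant's placebo and L-DOPA sessions would give the two learning rates that the comparison in panel (c) relies on. A gradient-based optimizer could replace the grid search.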
Anatomical connectivity and RPEs
- The authors' analysis identified substantial inter-individual variability among older adults for both reward and expected value representations in the nucleus accumbens at baseline (that is, under placebo; Supplementary Fig. 5), whereby the latter was associated with task performance.
- Using DTI and probabilistic tractography (n = 30 older adults), the authors defined a measure of connection strength between the right SN/VTA and right striatum (Supplementary Fig. 6 and Online Methods).
- Neither fractional anisotropy values of SN/VTA nor nucleus accumbens functional ROI correlated with expected value (Pearson’s r = 0.26 and r = 0.17, P = 0.16 and P = 0.38, respectively), suggesting that this correlation was related to circuit strength rather than to local structural integrity as determined by fractional anisotropy.
- This suggests that these older individuals had higher baseline integrity of the nigro-striatal dopamine circuit than older adults with lower baseline levels of performance.
DISCUSSION
- The authors used a probabilistic reinforcement learning task in combination with a pharmacological manipulation of dopamine, as well as structural [...]
- Together with recent findings38, these results raise the possibility that dopamine might only modulate the neural representation of expected value when it is behaviorally relevant for the task at hand.
- The connectivity strength of tracts is one DTI metric that has been reported to predict age-related performance differences39,40.
- In summary, their findings suggest that a subgroup of older adults who underperform at baseline can show a drug-induced improvement in task performance.
- For these older adults, L-DOPA increased a task-based learning rate and led to a canonical RPE signal by restoring the representation of expected value in the nucleus accumbens.
METHODS
- Methods and any associated references are available in the online version of the paper.
- Supplementary information is available in the online version of the paper.
ACKNOWLEDGMENTS
- The authors thank J. Medhora and L. Sasse for their assistance with data collection, and H. Barron and M. Klein-Flügge for their assistance with time course analyses.
- R.C. is supported by a Wellcome Trust Research Training Fellowship (WT088286MA).
- The Wellcome Trust Centre for Neuroimaging is supported by core funding from the Wellcome Trust (091593/Z/10/Z).
AUTHOR CONTRIBUTIONS
- R.C. and M.G.-M. conducted the experiment, analyzed the data and prepared the manuscript.
- Eppinger, B., Hämmerer, D. & Li, S.-C. Neuromodulation of reward-based learning and decision making in human aging.
- Samanez-Larkin, G.R., Wagner, A.D. & Knutson, B. Expected value information improves financial risk taking across the adult life span.
- Jocham, G., Klein, T.A. & Ullsperger, M. Dopamine-mediated reinforcement learning signals in the striatum and ventromedial prefrontal cortex underlie value-based choices.
ONLINE METHODS
- 32 healthy adults aged 65–75 years participated in the study (Supplementary Table 1).
- The RPE is often not separated into its components. However, recent studies have shown that, because R(t) is highly correlated with the RPE, a correlation between the BOLD signal and R(t) – Qa(t)(t) can produce false positives: areas whose BOLD signal correlates only with R(t) may be mistaken for areas that represent an RPE.
- The authors also performed separate multiple regression analyses for the same contrasts on the placebo and L-DOPA conditions separately using task performance (total won) as a between-subjects regressor.
- The main aim of this analysis was to visualize the effect of reward and expected value on the BOLD signal, at the time of the choice and at the time of the outcome, from the nucleus accumbens functional ROI over the course of a trial.
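The reason for separating the RPE into its components can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not the authors' analysis: it uses a single stimulus with a fixed 50% reward probability, a hypothetical learning rate of 0.3, and a hand-written Pearson correlation. It shows that the reward R(t) alone tracks the RPE closely, so a signal that encodes only reward would still correlate with R(t) – Q(t).

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate_learner(n_trials=2000, alpha=0.3, p_reward=0.5, seed=0):
    """Simulate rewards, expected values and RPEs for one stimulus.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    q = 0.5
    rewards, values, rpes = [], [], []
    for _ in range(n_trials):
        r = 1.0 if rng.random() < p_reward else 0.0
        rewards.append(r)
        values.append(q)        # expected value before the outcome
        rpes.append(r - q)      # RPE = R(t) - Q(t)
        q += alpha * (r - q)
    return rewards, values, rpes


rewards, values, rpes = simulate_learner()
print("corr(R, RPE):", round(pearson(rewards, rpes), 3))
print("corr(Q, RPE):", round(pearson(values, rpes), 3))
```

In this toy setting the reward correlates strongly and positively with the RPE, while the expected value correlates more weakly and negatively. Entering R(t) and Q(t) as separate regressors therefore tests whether both components are present, rather than relying on a single RPE regressor.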
Citations
420 citations
Cites background from "Dopamine restores reward prediction..."
...Implications The present findings have potential implications for understanding memory deficits in the elderly and in patients with psychiatric and neurological disorders that affect dopaminergic transmission (Chowdhury et al., 2013; Düzel et al., 2010; Goto and Grace, 2008; Lisman et al., 2011)....
354 citations
Cites result from "Dopamine restores reward prediction..."
...Thus, our findings echo those of a recent study showing that dopamine agonists can reinstate prediction errors by restoring the component of the prediction error related to expected value, rather than that of reward [84]....
Frequently Asked Questions (12)
Q2. What future work have the authors mentioned in the paper "Dopamine restores reward prediction errors in old age"?
This possibility is supported by evidence that older adults perform better than younger adults in tasks requiring a model of the environment (for example, where future outcomes are dependent on previous choices)27. In the future, it would be interesting to use procedures based on recent studies (for example, see refs. [...]). In summary, their findings suggest that a subgroup of older adults who underperform at baseline can show a drug-induced improvement in task performance. On the other hand, participants who performed better on the task under placebo (that is, on a par with young controls) had a greater representation of expected value in the striatum and stronger nigro-striatal connectivity, suggesting higher baseline dopamine status.
Q3. What was the general linear model for each subject at the first level?
The general linear model for each subject at the first level consisted of regressors at the time of stimulus display separately for when a choice was made, when no choice was made and at the time of stimulus outcome.
Q4. What is the effect of L-DOPA on the learning rate of older adults?
For these older adults, L-DOPA increased a task-based learning rate and led to a canonical RPE signal by restoring the representation of expected value in the nucleus accumbens.
Q5. How many b values were used to acquire the first seven reference images?
The first seven reference images were acquired with a b value of 100 s mm−2 (low b images) and the remaining 61 images with a b value of 1,000 s mm−2 (ref. 48).
Q6. What is the likely explanation for the absence of a model-free expected value signal?
Another possibility for the absence of a model-free expected value signal is that it is still calculated normally, but that when dopamine levels are low, it is not manifest in nucleus accumbens BOLD signal.
Q7. What is the effect of habit training on dopaminergic reactivity?
37. Choi, W.Y., Balsam, P.D. & Horvitz, J.C. Extended habit training reduces dopamine mediation of appetitive response expression.
Q8. What is the effect of L-DOPA on performance?
This suggests that, for participants with high baseline levels of performance, L-DOPA increased noise in their reward and expected value representations and this was associated with a worsening in performance.
Q9. What was the probability of obtaining a reward for each stimulus?
The probabilities of obtaining a reward for each stimulus were independent of each other and varied on a trial-to-trial basis according to a Gaussian random walk, generated using a previously described procedure8.
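A drifting reward schedule of this kind can be sketched as follows. The source does not give the parameters of the walk, so the starting probability, the step standard deviation and the clipping to [0, 1] below are assumptions chosen for illustration. The previously described procedure cited in the paper may differ, for example in how it handles boundaries.

```python
import random


def gaussian_random_walk(n_trials, start=0.5, step_sd=0.03,
                         lower=0.0, upper=1.0, rng=None):
    """Return a list of reward probabilities that drift from trial to trial.

    Each step adds zero-mean Gaussian noise, then clips the result to
    [lower, upper]. All default values are illustrative assumptions.
    """
    rng = rng or random.Random()
    p = start
    probs = []
    for _ in range(n_trials):
        probs.append(p)
        p = min(upper, max(lower, p + rng.gauss(0.0, step_sd)))
    return probs


def make_task(n_trials, seed=0):
    """Generate one walk per stimulus; separate noise draws keep them independent."""
    rng = random.Random(seed)
    return [gaussian_random_walk(n_trials, rng=rng) for _ in range(2)]


schedule = make_task(n_trials=10, seed=1)
for trial, (p_a, p_b) in enumerate(zip(*schedule)):
    print(trial, round(p_a, 3), round(p_b, 3))
```

Because the probabilities keep changing, participants must keep updating their value estimates throughout the task, which is what makes the learning rate informative.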
Q10. What was the voxel in the nucleus accumbens?
[...] (Supplementary Fig. 2), the authors first defined voxels in the nucleus accumbens that signaled a ‘putative’ prediction error, namely voxels in which there was an enhanced response at the time of outcome to actual rewards that was greater than that to expected rewards (R(t) > Qa(t)(t); see Online Methods). (Nature Neuroscience, Volume 16, Number 5, May 2013, p. 650.)
Q11. What is the definition of a canonical RPE?
Note that this is a liberal definition of RPEs, as voxels showing a significant effect with this contrast may not satisfy all of the criteria to be considered for a canonical RPE, namely both a positive effect of reward and a negative effect of expected value19,21.
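The two criteria for a canonical RPE can be made concrete with a small regression sketch. This is a simplified illustration under stated assumptions: it fits an ordinary least-squares model with an intercept to a single time series and checks only the signs of the two coefficients, whereas the paper's analysis uses statistical tests across participants. The function names are hypothetical.

```python
def fit_reward_and_value(signal, rewards, values):
    """Least-squares fit of signal = b0 + b_r * reward + b_q * value.

    Returns (b_r, b_q). Variables are mean-centered so that the intercept
    drops out and the 2x2 normal equations can be solved directly.
    """
    n = len(signal)
    mean = lambda xs: sum(xs) / n
    my, mr, mq = mean(signal), mean(rewards), mean(values)
    y = [s - my for s in signal]
    r = [x - mr for x in rewards]
    q = [x - mq for x in values]
    srr = sum(a * a for a in r)
    sqq = sum(a * a for a in q)
    srq = sum(a * b for a, b in zip(r, q))
    sry = sum(a * b for a, b in zip(r, y))
    sqy = sum(a * b for a, b in zip(q, y))
    det = srr * sqq - srq * srq
    if det == 0:
        raise ValueError("reward and value regressors are collinear")
    b_r = (sry * sqq - sqy * srq) / det
    b_q = (sqy * srr - sry * srq) / det
    return b_r, b_q


def is_canonical_rpe(b_r, b_q):
    """Sign check only: positive effect of reward AND negative effect of value."""
    return b_r > 0 and b_q < 0
```

A signal that passes the liberal contrast R(t) > Q(t) but has a zero or positive coefficient for expected value would fail this check, which is the distinction between a putative and a canonical RPE.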
Q12. Why did the authors not include parametric modulators at the time of the outcome?
Although the SPM model included regressors at the time of the choice and at the time of the outcome, the authors only included parametric modulators at the time of the outcome, thereby focusing on just outcome prediction errors.