Learning to represent reward structure: a key to adapting to complex environments.
Citations
441 citations
196 citations
155 citations
132 citations
Cites background from "Learning to represent reward struct..."
...However, the community has recently begun to appreciate the importance of the underlying reward structure (Nakahara and Hikosaka 2012)....
[...]
113 citations
References
37,989 citations
"Learning to represent reward struct..." refers background in this paper
...research (Sutton and Barto, 1990) and remains an active research area in computer science and machine learning (Sutton and Barto, 1998)....
[...]
...168-0102/$ – see front matter © 2012 Elsevier Ireland Ltd and the Japan Neuroscience S http://dx.doi.org/10.1016/j.neures.2012.09.007...
[...]
8,163 citations
"Learning to represent reward struct..." refers background in this paper
...the value-based decision making process and the underlying neural mechanisms (Montague et al., 1996; Schultz et al., 1997)....
[...]
...A marked example is an ingenious hypothesis about dopamine phasic activity as a learning signal for TD learning (called TD error), which is the strongest example of mapping to date, and is thus a critical driving force behind the progress in this field (Barto, 1994; Houk et al., 1994; Montague et al., 1996; Schultz et al., 1997)....
[...]
...This transparent mapping has helped to drive the field’s progress since the proposal of this hypothesis, and it has been observed as the correspondence between “canonical” DA responses and the TD error of the hypothesis (Schultz et al., 1997)....
[...]
3,962 citations
2,171 citations
1,950 citations
"Learning to represent reward struct..." refers background in this paper
...Indeed, DA activity is also shown to encode “uncertainty” signals (Fiorillo et al., 2003) or “information-seeking” signals (Bromberg-Martin and Hikosaka, 2009)....
[...]