Hedonic value: enhancing adaptation for motivated agents
Citations
Emotion in reinforcement learning agents and robots: a survey
Making New "New AI" Friends : Designing a Social Robot for Diabetic Children from an Embodied AI Perspective
References
The Ecological Approach to Visual Perception
Learning internal representations by error propagation
Parallel Distributed Processing: Explorations in the Microstructure of Cognition: Foundations
Introduction to Reinforcement Learning
Related Papers (5)
Learning Affordances of Consummatory Behaviors: Motivation-Driven Adaptive Perception
Frequently Asked Questions (12)
Q2. What future works have the authors mentioned in the paper "Hedonic value: enhancing adaptation for motivated agents" ?
This paper has formulated a process of internal modulation of value as an additional mechanism to extend the agent’s adaptivity to difficult or changing environments. It shows that when the authors make reward value depend on the motivational state and on previous experience of the environment (in their case recorded by the actor-critic policies), this modulation can exert a significant influence on the behavioural cycles generated and on the agent’s overall physiological stability. Although further study will be necessary to find specific methodologies for developing mechanisms of adaptation based on affective phenomena, the review and results presented here highlight that processes of this kind are likely to play a significant role in behavioural adaptation.
Q3. What is the model of artificial physiology?
Their model of artificial physiology consists of a set of homeostatic, survival-related variables, a set of drives that depend on the internal variables and a repertoire of behaviours.
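The three components named above (homeostatic variables, drives derived from them, and a repertoire of behaviours) can be pictured with a minimal Python sketch. This is an illustration only, not the authors' implementation: the variable names, setpoints, decay rates, gains, and the deficit-based drive formula are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class HomeostaticVariable:
    """A survival-related internal variable (all numeric values are assumed)."""
    name: str
    value: float            # current level
    setpoint: float = 1.0   # ideal level (assumption)
    decay: float = 0.05     # loss per time step (assumption)

    def step(self) -> None:
        # The variable drifts away from its setpoint over time.
        self.value = max(0.0, self.value - self.decay)

    def drive(self) -> float:
        # Assumed drive: the deficit relative to the setpoint.
        return max(0.0, self.setpoint - self.value)


# Hypothetical behaviour repertoire: behaviour -> (variable restored, gain).
BEHAVIOURS = {"eat": ("energy", 0.3), "drink": ("hydration", 0.3)}


def execute(behaviour: str, physiology: dict) -> None:
    """Apply the physiological effect of a behaviour, capped at the setpoint."""
    target, gain = BEHAVIOURS[behaviour]
    var = physiology[target]
    var.value = min(var.setpoint, var.value + gain)


physiology = {
    "energy": HomeostaticVariable("energy", 0.4),
    "hydration": HomeostaticVariable("hydration", 0.9),
}
drives = {name: var.drive() for name, var in physiology.items()}
most_urgent = max(drives, key=drives.get)  # energy has the larger deficit
```

In this sketch, the drive with the largest deficit identifies the most urgent need, and executing a matching behaviour moves the corresponding variable back towards its setpoint.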
Q4. How has the adaptive value of the agent been assessed?
The adaptivity of the agent has been assessed in terms of its physiological stability (Ashby, 1965), as a function of the response to changes in the availability of resources of the environment.
Q5. How many behaviour executions are possible in the ideal scenario?
The shortest possible cycle length for the given decay constant τ (10 ms) is obtained in the ideal scenario: 11 behaviour executions.
Q6. What is the main brain area specialised in the encoding of hedonic value?
The main brain area specialised in encoding hedonic value independently of behaviour is the orbitofrontal cortex (OFC).
Q7. What is the percentage of mismatch between the greedy and the physiological policy?
The remaining 35% of mismatch may be distributed between the 20% excluded by the greedy policy (ε = 0.2) and 15% due to decisions on behaviours that do not match the affordance offered by the nearby object, or whose physiological effect would act on a homeostatic variable that is already sated.
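For readers unfamiliar with the ε parameter, the following is a minimal sketch of standard ε-greedy action selection with ε = 0.2. It is a generic textbook form, not necessarily the exact selection rule used in the paper; the function name and the example action values are hypothetical. Note that in this common variant the random branch can still pick the greedy action, so the share of non-greedy choices is at most ε.

```python
import random


def epsilon_greedy(values, epsilon=0.2, rng=random):
    """Return an action index: random with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        # Exploration: uniform choice over all actions (may equal the greedy one).
        return rng.randrange(len(values))
    # Exploitation: index of the highest-valued action.
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical action values for four behaviours.
action_values = [0.1, 0.7, 0.3, 0.2]
rng = random.Random(0)
choices = [epsilon_greedy(action_values, 0.2, rng) for _ in range(10_000)]
greedy_share = choices.count(1) / len(choices)
```

With four actions, the expected greedy share in this variant is 0.8 + 0.2/4 = 0.85, which is why the 20% figure is best read as an upper bound on exploration-induced mismatch.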
Q8. How many simulation runs were performed?
The authors performed twenty simulation runs per condition, and recorded the time-course of the agent’s internal physiology and the encompassing behavioural cycles throughout this time.
Q9. How does the cycle length change over time?
As expected, the cycle length exhibits a gradual shortening in all cases, as the knowledge about the environment improves and the behavioural policy becomes increasingly effective.
Q10. How is the neuro-inspired notion of subjective assessment implemented?
Their neuro-inspired notion of subjective assessment is implemented as a value function, and is tested by learning behavioural responses to different stimuli and physiological states in a manner compliant with the hypothesis of phasic dopamine as an error signal (Khamassi et al., 2005; McClure et al., 2003; Schultz et al., 2000; Houk et al., 1995).
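The "phasic dopamine as an error signal" hypothesis is usually expressed through the temporal-difference (TD) prediction error used by the critic in actor-critic methods. The sketch below shows a generic tabular TD(0) update; it is an illustration under assumed values, not the paper's model. The state names, learning rate `alpha`, and discount `gamma` are hypothetical.

```python
def td_update(V, state, reward, next_state, alpha=0.1, gamma=0.9):
    """One TD(0) update of a tabular value function; returns the error."""
    # Prediction error: the quantity the dopamine hypothesis refers to.
    delta = reward + gamma * V[next_state] - V[state]
    # Move the estimate for the current state towards the target.
    V[state] += alpha * delta
    return delta


# Hypothetical states combining a stimulus with a physiological condition.
V = {"hungry_near_food": 0.0, "sated": 0.0}
delta = td_update(V, "hungry_near_food", reward=1.0, next_state="sated")
```

A positive `delta` means the outcome was better than predicted, so the value of the preceding state is raised; as predictions improve, the error for the same outcome shrinks.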
Q11. What is the nature of value encoding?
The nature of value encoding is constrained by the need to implement decisions, which requires comparison across often dissimilar options.
Q12. Why is the agent endowed with a wandering behaviour?
The authors first endowed their agent with a wandering behaviour to facilitate exploration and, with it, encounters with objects.