Book ChapterDOI

Understanding the role of serotonin in basal ganglia through a unified model

TL;DR: A Reinforcement Learning (RL)-based model of serotonin is presented that reconciles some of the diverse roles of the neuromodulator; it uses a novel utility function formulated as a weighted sum of the traditional value function and a risk function.
Abstract: We present a Reinforcement Learning (RL)-based model of serotonin that tries to reconcile some of the diverse roles of the neuromodulator. The proposed model uses a novel formulation of utility function, which is a weighted sum of the traditional value function and the risk function. Serotonin is represented by the weight, α, used in this combination. The model is applied to three different experimental paradigms: 1) bee foraging behavior, which involves decision making based on risk, 2) a temporal reward prediction task, in which serotonin (α) controls the time-scale of reward prediction, and 3) a reward/punishment prediction task, in which punishment prediction error depends on serotonin levels. The three diverse roles of serotonin (time-scale of reward prediction, risk modeling, and punishment prediction) are explained within a single framework by the model.
Citations
Journal ArticleDOI
TL;DR: The first model of Precision Grip (PG) in Parkinson's Disease (PD) conditions built using Reinforcement Learning, with the significant difference that action selection is performed over a utility distribution rather than a purely value-based one, thereby incorporating risk-based decision making.
Abstract: We propose a computational model of Precision Grip (PG) performance in normal subjects and Parkinson’s Disease (PD) patients. Prior studies on grip force generation in PD patients show an increase in grip force during ON medication and an increase in the variability of the grip force during OFF medication (Fellows et al. 1998; Ingvarsson et al. 1997). Changes in grip force generation in dopamine-deficient PD conditions strongly suggest contribution of the Basal Ganglia, a deep brain system having a crucial role in translating dopamine signals to decision making. The present approach is to treat the problem of modeling grip force generation as a problem of action selection, which is one of the key functions of the Basal Ganglia. The model consists of two components: 1) the sensory-motor loop component, and 2) the Basal Ganglia component. The sensory-motor loop component converts a reference position and a reference grip force into lift force and grip force profiles, respectively. These two forces cooperate in grip-lifting a load. The sensory-motor loop component also includes a plant model that represents the interaction between the two fingers involved in PG and the object to be lifted. The Basal Ganglia component is modeled using Reinforcement Learning, with the significant difference that action selection is performed using a utility distribution instead of a purely value-based distribution, thereby incorporating risk-based decision making. The proposed model is able to account for the precision grip results from normal subjects and PD patients accurately (Fellows et al. 1998; Ingvarsson et al. 1997). To our knowledge the model is the first model of precision grip in PD conditions.
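Selecting actions from a utility distribution, as opposed to a purely value-based one, can be sketched as a softmax over risk-adjusted utilities. This is a minimal illustration under assumed names; the abstract does not give the exact selection rule, so the softmax form and the inverse-temperature parameter beta are assumptions.

```python
import math

def utility_softmax(values, risks, alpha, beta=2.0):
    """Action-selection probabilities over utilities U_i = value_i + alpha * risk_i.

    alpha < 0 penalizes risky actions (risk aversion); alpha = 0 reduces
    the rule to ordinary value-based softmax selection.
    """
    utilities = [v + alpha * r for v, r in zip(values, risks)]
    m = max(utilities)                           # subtract max for numerical stability
    exps = [math.exp(beta * (u - m)) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]
```

With two equal-value actions where one carries higher risk, a negative alpha shifts choice probability toward the safer action, which is the behavioral signature the model uses to capture PD grip-force data.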

22 citations

Book ChapterDOI
01 Jan 2018
TL;DR: This chapter argues that describing the two BG pathways as having mutually opponent actions has limitations, and that the BG indirect pathway also plays a role in exploration, a mechanism then used to simulate various processes of the basal ganglia.
Abstract: One of the earliest attempts at building a theory of the basal ganglia (BG) is based on the clinical findings that lesions to the direct and indirect pathways of the BG produce quite opposite motor manifestations (Albin et al., in Trends Neurosci 12(10):366–375, 1989). While lesions of the direct pathway (DP), affecting particularly the projections from the striatum to GPi, are associated with hypokinetic disorders (distinguished by a paucity of movement), lesions of the indirect pathway (IP) produce hyperkinetic disorders, such as chorea and tremor. In this chapter, we argue that describing the two BG pathways as having mutually opponent actions has limitations. We argue that the BG indirect pathway also plays a role in exploration. We show evidence from various motor learning and decision-making tasks that exploration is a necessary process in various behavioral processes. Importantly, we use the exploration mechanism explained here to simulate various processes of the basal ganglia, which we discuss in the following chapters.

13 citations

Book ChapterDOI
01 Jan 2018
TL;DR: This chapter presents an extended reinforcement learning (RL)-based model of DA and 5-HT function in the BG, which reconciles some of the diverse roles of 5-HT.
Abstract: In addition to dopaminergic input, serotonergic (5-HT) fibers also widely arborize through the basal ganglia circuits and strongly control their dynamics. Although empirical studies show that 5-HT plays many functional roles in risk-based decision making, reward, and punishment learning, prior computational models mostly focus on its role in behavioral inhibition or timescale of prediction. This chapter presents an extended reinforcement learning (RL)-based model of DA and 5-HT function in the BG, which reconciles some of the diverse roles of 5-HT. The model uses the concept of utility function—a weighted sum of the traditional value function expressing the expected sum of the rewards, and a risk function expressing the variance observed in reward outcomes. Serotonin is represented by a weight parameter used in this combination of value and risk functions, while the neuromodulator dopamine (DA) is represented as reward prediction error as in the classical models. Consistent with this abstract model, a network model is also presented in which medium spiny neurons (MSNs) co-expressing both D1 and D2 receptors (D1R–D2R) are suggested to compute risk, while those expressing only D1 receptors are suggested to compute value. This BG model includes nuclei such as the striatum, Globus Pallidus externa, Globus Pallidus interna, and subthalamic nucleus. DA and 5-HT are modeled to differentially affect both the direct pathway (DP) and the indirect pathway (IP), comprising D1R, D2R, and D1R–D2R projections. Both abstract and network models are applied to data from different experimental paradigms used to study the role of 5-HT: (1) risk-sensitive decision making, where 5-HT controls the risk sensitivity; (2) temporal reward prediction, where 5-HT controls the timescale of reward prediction; and (3) reward–punishment sensitivity, where punishment prediction error depends on 5-HT levels.
Both the extended RL model (Balasubramani, Chakravarthy, Ravindran, & Moustafa, in Front Comput Neurosci 8:47, 2014; Balasubramani, Ravindran, & Chakravarthy, in Understanding the role of serotonin in basal ganglia through a unified model, 2012) and its network correlates (Balasubramani, Chakravarthy, Ravindran, & Moustafa, in Front Comput Neurosci 9:76, 2015; Balasubramani, Chakravarthy, Ali, Ravindran, & Moustafa, in PLoS ONE 10(6):e0127542, 2015) successfully explain the three diverse roles of 5-HT in a single framework.

6 citations

Journal ArticleDOI
12 Aug 2021
TL;DR: This model involves two critics, an optimistic learning system and a pessimistic learning system, whose predictions are integrated in time to control how potential decisions compete to be selected, and predicts that human decision-making can be decomposed along two dimensions.
Abstract: Recent experiments and theories of human decision-making suggest positive and negative errors are processed and encoded differently by serotonin and dopamine, with serotonin possibly serving to oppose dopamine and protect against risky decisions. We introduce a temporal difference (TD) model of human decision-making to account for these features. Our model involves two critics, an optimistic learning system and a pessimistic learning system, whose predictions are integrated in time to control how potential decisions compete to be selected. Our model predicts that human decision-making can be decomposed along two dimensions: the degree to which the individual is sensitive to (1) risk and (2) uncertainty. In addition, we demonstrate that the model can learn about the mean and standard deviation of rewards, and provide information about reaction time despite not modeling these variables directly. Lastly, we simulate a recent experiment to show how updates of the two learning systems could relate to dopamine and serotonin transients, thereby providing a mathematical formalism to serotonin’s hypothesized role as an opponent to dopamine. This new model should be useful for future experiments on human decision-making.
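The two-critic idea described above can be sketched with asymmetric learning rates: an optimistic critic that learns strongly from positive prediction errors and weakly from negative ones, and a pessimistic critic with the opposite bias, their predictions then integrated to drive choice. This is an assumed minimal form, not the paper's implementation; the function name, learning rates, and the simple averaging used for integration are all illustrative.

```python
def two_critic_update(opt, pes, state, reward, lr_fast=0.2, lr_slow=0.02):
    """Update optimistic (opt) and pessimistic (pes) value tables for `state`.

    The optimistic critic weights positive prediction errors more heavily;
    the pessimistic critic weights negative ones more heavily. Returns a
    simple integrated prediction (here, the mean of the two critics).
    """
    for table, pos_lr, neg_lr in ((opt, lr_fast, lr_slow), (pes, lr_slow, lr_fast)):
        delta = reward - table[state]          # per-critic prediction error
        table[state] += (pos_lr if delta > 0 else neg_lr) * delta
    return 0.5 * (opt[state] + pes[state])
```

Under variable rewards, the two critics settle near the upper and lower ends of the reward distribution rather than its mean, which is how the model captures sensitivity to risk and uncertainty as separate dimensions.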

4 citations

Posted ContentDOI
11 Dec 2020-bioRxiv
TL;DR: This model involves two critics, an optimistic learning system and a pessimistic learning system, whose predictions are integrated in time to control how potential decisions compete to be selected, and predicts that human decision-making can be decomposed along two dimensions.
Abstract: Recent experiments and theories of human decision-making suggest positive and negative errors are processed and encoded differently by serotonin and dopamine, with serotonin possibly serving to oppose dopamine and protect against risky decisions. We introduce a temporal difference (TD) model of human decision-making to account for these features. Our model involves two opposing counsels, an optimistic learning system and a pessimistic learning system, whose predictions are integrated in time to control how potential decisions compete to be selected. Our model predicts that human decision-making can be decomposed along two dimensions: the degree to which the individual is sensitive to (1) risk and (2) uncertainty. In addition, we demonstrate that the model can learn about reward expectations and uncertainty, and provide information about reaction time despite not modeling these variables directly. Lastly, we simulate a recent experiment to show how updates of the two learning systems could relate to dopamine and serotonin transients, thereby providing a mathematical formalism to serotonin's hypothesized role as an opponent to dopamine. This new model should be useful for future experiments on human decision-making.

2 citations


Cites background from "Understanding the role of serotonin..."

  • "We are not the first to try to interpret observations of serotonin and dopamine through the lens of a computational model [6, 18, 30, 40]."