Journal ArticleDOI

Medial prefrontal cortex as an action-outcome predictor

01 Oct 2011-Nature Neuroscience (Nature Publishing Group)-Vol. 14, Iss: 10, pp 1338-1344
TL;DR: A simple model based on standard learning rules can simulate and unify an unprecedented range of known effects in mPFC, suggesting a new view of the medial prefrontal cortex as a region concerned with learning and predicting the likely outcomes of actions, whether good or bad.
Abstract: The medial prefrontal cortex (mPFC) and especially anterior cingulate cortex is central to higher cognitive function and many clinical disorders, yet its basic function remains in dispute. Various competing theories of mPFC have treated effects of errors, conflict, error likelihood, volatility and reward, using findings from neuroimaging and neurophysiology in humans and monkeys. No single theory has been able to reconcile and account for the variety of findings. Here we show that a simple model based on standard learning rules can simulate and unify an unprecedented range of known effects in mPFC. The model reinterprets many known effects and suggests a new view of mPFC, as a region concerned with learning and predicting the likely outcomes of actions, whether good or bad. Cognitive control at the neural level is then seen as a result of evaluating the probable and actual outcomes of one's actions.
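The learning rule the abstract describes (predicting the likely outcomes of an action and registering the discrepancy when the actual outcome arrives) can be illustrated with a minimal sketch. The code below is a hypothetical delta-rule outcome predictor, not the authors' published model; the task probabilities, learning rate, and variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions, n_outcomes = 2, 2
alpha = 0.1                            # learning rate (assumed value)
V = np.zeros((n_actions, n_outcomes))  # predicted outcome likelihoods

# Hypothetical true outcome probabilities per action
true_p = np.array([[0.8, 0.2],   # action 0: mostly outcome 0
                   [0.3, 0.7]])  # action 1: mostly outcome 1

for trial in range(5000):
    a = rng.integers(n_actions)               # take an action
    o = rng.choice(n_outcomes, p=true_p[a])   # observe its outcome
    target = np.eye(n_outcomes)[o]            # one-hot actual outcome
    delta = target - V[a]                     # vector-valued prediction error
    V[a] += alpha * delta                     # delta-rule update

print(np.round(V, 2))  # each row approaches the true outcome probabilities
```

Each row of `V` converges toward the true outcome probabilities for that action, and the size of `delta` on a given trial serves as a surprise signal of the kind the model attributes to mPFC.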
Citations
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).
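The mail-filtering scenario in the abstract can be sketched in a few lines. The snippet below is a hypothetical add-one-smoothed naive Bayes filter trained on a toy set of messages; the data, tokens, and labels are invented for illustration and are not from the article.

```python
from collections import Counter

# Hypothetical training data: (message tokens, the user's verdict)
train = [
    ("win money now".split(), "spam"),
    ("cheap money offer".split(), "spam"),
    ("meeting agenda attached".split(), "ham"),
    ("lunch meeting tomorrow".split(), "ham"),
]

# Count word frequencies per class
counts = {"spam": Counter(), "ham": Counter()}
class_n = Counter()
for words, label in train:
    counts[label].update(words)
    class_n[label] += 1

vocab = {w for ws, _ in train for w in ws}

def score(words, label):
    # Naive Bayes score with add-one (Laplace) smoothing
    total = sum(counts[label].values())
    p = class_n[label] / sum(class_n.values())
    for w in words:
        p *= (counts[label][w] + 1) / (total + len(vocab))
    return p

def classify(words):
    return max(("spam", "ham"), key=lambda c: score(words, c))

print(classify("cheap money".split()))       # spam
print(classify("meeting tomorrow".split()))  # ham
```

As the user keeps rejecting or accepting messages, new `(tokens, verdict)` pairs are appended to `train` and the counts updated, which is exactly the "maintain the filtering rules automatically" behavior the abstract describes.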

13,246 citations

Journal ArticleDOI
24 Jul 2013-Neuron
TL;DR: This work presents a normative model of EVC that integrates three critical factors: the expected payoff from a controlled process, the amount of control that must be invested to achieve that payoff, and the cost in terms of cognitive effort.

1,625 citations


Cites background from "Medial prefrontal cortex as an action-outcome predictor"

  • ...…elevated dACC activity in response to surprising outcomes (Cavanagh et al., 2012; Landmann et al., 2007; Nee et al., 2011; Wessel et al., 2012) and, more generally, following unanticipated shifts in task contingencies (Alexander and Brown, 2011; Behrens et al., 2007; Bland and Schaefer, 2011)....


  • ...These observations have inspired a recent model of dACC function by Alexander and Brown (2011), which suggests that dACC stores predicted associations between stimuli and response-outcome (R-O) conjunctions, and signals any violations of these predicted S-R-O relationships....


Journal ArticleDOI
20 Dec 2012-Neuron
TL;DR: It is proposed that the function of the medial prefrontal cortex is to learn associations between context, locations, events, and corresponding adaptive responses, particularly emotional responses, and that mPFC likely relies on the hippocampus to support rapid learning and memory consolidation.

1,153 citations


Cites background from "Medial prefrontal cortex as an action-outcome predictor"

  • ...Alternatively, it has been suggested that the mPFC compares actual and expected outcomes and computes the degree of expectancy violation (i.e., "surprise") (Alexander and Brown, 2011)....


  • ...As has been previously suggested, the mPFC likely forms and stores schema which map context and events onto appropriate actions (Alexander and Brown, 2011; Miller and Cohen, 2001)....


Journal ArticleDOI
TL;DR: This model integrates a broad range of previously disparate evidence, makes predictions for conjoint manipulations of agency and presence, offers a new view of emotion as interoceptive inference, and represents a step toward a mechanistic account of a fundamental phenomenological property of consciousness.
Abstract: We describe a theoretical model of the neurocognitive mechanisms underlying conscious presence and its disturbances. The model is based on interoceptive prediction error and is informed by predictive models of agency, general models of hierarchical predictive coding and dopaminergic signaling in cortex, the role of the anterior insular cortex (AIC) in interoception and emotion, and cognitive neuroscience evidence from studies of virtual reality and of psychiatric disorders of presence, specifically depersonalization/derealization disorder. The model associates presence with successful suppression by top-down predictions of informative interoceptive signals evoked by autonomic control signals and, indirectly, by visceral responses to afferent sensory signals. The model connects presence to agency by allowing that predicted interoceptive signals will depend on whether afferent sensory signals are determined, by a parallel predictive-coding mechanism, to be self-generated or externally caused. Anatomically, we identify the AIC as the likely locus of key neural comparator mechanisms. Our model integrates a broad range of previously disparate evidence, makes predictions for conjoint manipulations of agency and presence, offers a new view of emotion as interoceptive inference, and represents a step toward a mechanistic account of a fundamental phenomenological property of consciousness.
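The comparator idea at the heart of the model (presence as successful top-down suppression of interoceptive signals) can be caricatured in a few lines. The quantities and thresholds below are invented for illustration and are not the authors' model.

```python
# Toy comparator: an interoceptive signal and a top-down prediction of it.
# Presence corresponds to a small residual prediction error (successful
# suppression); a disturbance of presence to a large residual.

def residual_error(interoceptive_signal, top_down_prediction):
    return abs(interoceptive_signal - top_down_prediction)

print(round(residual_error(0.7, 0.65), 2))  # 0.05 -> signal well predicted
print(round(residual_error(0.7, 0.10), 2))  # 0.6  -> prediction fails
```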

687 citations

References

Journal ArticleDOI
TL;DR: This paper presents and proves in detail a convergence theorem forQ-learning based on that outlined in Watkins (1989), showing that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action- values are represented discretely.
Abstract: Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states. This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989). We show that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely. We also sketch extensions to the cases of non-discounted, but absorbing, Markov environments, and where many Q values can be changed each iteration, rather than just one.
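The update rule the abstract refers to can be made concrete with a minimal tabular Q-learning loop. The three-state chain below is a hypothetical toy MDP (its transitions, learning rate, and discount are invented for illustration, not taken from the paper); under the stated conditions (all actions repeatedly sampled in all states, discrete action-values) the table converges to the optimal action-values.

```python
import random

random.seed(0)

# Hypothetical 3-state chain: states 0, 1, 2; state 2 is terminal and
# entering it yields reward 1. Actions: 0 = left, 1 = right.
n_states, n_actions = 3, 2
gamma, alpha = 0.9, 0.5            # discount and learning rate (assumed)
Q = [[0.0] * n_actions for _ in range(n_states)]

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    reward = 1.0 if s2 == n_states - 1 else 0.0
    return s2, reward

for episode in range(500):
    s = 0
    while s != n_states - 1:
        a = random.randrange(n_actions)      # sample all actions repeatedly
        s2, r = step(s, a)
        target = r + gamma * max(Q[s2])      # bootstrapped Q-learning target
        Q[s][a] += alpha * (target - Q[s][a])
        s = s2

# Q approaches the optimal action-values:
# Q[0] ~ [0.81, 0.9], Q[1] ~ [0.81, 1.0]
print([[round(q, 2) for q in row] for row in Q])
```

Because the toy environment is deterministic and every state-action pair keeps being visited, the table settles on the unique fixed point of the update, which is the condition the convergence theorem formalizes.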

8,450 citations

Journal ArticleDOI
TL;DR: Two computational modeling studies are reported, serving to articulate the conflict monitoring hypothesis and examine its implications, including a feedback loop connecting conflict monitoring to cognitive control, and a number of important behavioral phenomena.
Abstract: A neglected question regarding cognitive control is how control processes might detect situations calling for their involvement. The authors propose here that the demand for control may be evaluated in part by monitoring for conflicts in information processing. This hypothesis is supported by data concerning the anterior cingulate cortex, a brain area involved in cognitive control, which also appears to respond to the occurrence of conflict. The present article reports two computational modeling studies, serving to articulate the conflict monitoring hypothesis and examine its implications. The first study tests the sufficiency of the hypothesis to account for brain activation data, applying a measure of conflict to existing models of tasks shown to engage the anterior cingulate. The second study implements a feedback loop connecting conflict monitoring to cognitive control, using this to simulate a number of important behavioral phenomena.
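The conflict measure described in the abstract can be illustrated with a toy calculation. The models it refers to compute conflict from the coactivation of incompatible response units (a Hopfield-style energy term); the simplified two-unit version below is an illustration with invented activation values, not the authors' exact formulation.

```python
# Two mutually incompatible response units with activations in [0, 1].
# Conflict is measured as their coactivation, i.e. the energy of one
# inhibitory pair; it is high only when both responses are active at once.

def conflict(a_left, a_right, w=1.0):
    return w * a_left * a_right

# Congruent trial: one response dominates, little conflict
print(round(conflict(0.9, 0.1), 2))  # 0.09
# Incongruent trial: both responses partially active, high conflict
print(round(conflict(0.6, 0.5), 2))  # 0.3
```

In the feedback-loop version of the hypothesis, this scalar would then be fed back to increase control (e.g. attentional focus) on the next trial.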

6,385 citations


"Medial prefrontal cortex as an action-outcome predictor" refers background in this paper

  • ...In essence, equation (5) specifies a vector-valued temporal difference model that learns a prediction proportional to the likelihood of a given response-outcome conjunction at a given time....


Journal ArticleDOI
TL;DR: In this paper, a 1-sec tachistoscopic exposure, Ss responded with a right or left leverpress to a single target letter from the sets H and K or S and C. The target always appeared directly above the fixation cross.
Abstract: During a 1-sec tachistoscopic exposure, Ss responded with a right or left leverpress to a single target letter from the sets H and K or S and C. The target always appeared directly above the fixation cross. Experimentally varied were the types of noise letters (response compatible or incompatible) flanking the target and the spacing between the letters in the display. In all noise conditions, reaction time (RT) decreased as between-letter spacing increased. However, noise letters of the opposite response set were found to impair RT significantly more than same response set noise, while mixed noise letters belonging to neither set but having set-related features produced intermediate impairment. Differences between two target-alone control conditions, one presented intermixed with noise-condition trials and one presented separately in blocks, gave evidence of a preparatory set on the part of Ss to inhibit responses to the noise letters. It was concluded that S cannot prevent processing of noise letters occurring within about 1 deg of the target due to the nature of processing channel capacity and must inhibit his response until he is able to discriminate exactly which letter is in the target position. This discrimination is more difficult and time consuming at closer spacings, and inhibition is more difficult when noise letters indicate the opposite response from the target.

6,234 citations


"Medial prefrontal cortex as an action-outcome predictor" refers background or methods in this paper

  • ...The role of S is to provide an immediate prediction of the likely outcomes of actions and inhibit those that are predicted to yield an undesirable outcome (see equation (13))....


  • ...The model architecture remained the same as in simulation 1 except that lateral inhibition between response units (equation (13)) was removed to allow simultaneous generation of responses....


Journal ArticleDOI
TL;DR: This paper presented a unified account of two neural systems concerned with the development and expression of adaptive behaviors: a mesencephalic dopamine system for reinforcement learning and a generic error-processing system associated with the anterior cingulate cortex.
Abstract: The authors present a unified account of 2 neural systems concerned with the development and expression of adaptive behaviors: a mesencephalic dopamine system for reinforcement learning and a "generic" error-processing system associated with the anterior cingulate cortex. The existence of the error-processing system has been inferred from the error-related negativity (ERN), a component of the event-related brain potential elicited when human participants commit errors in reaction-time tasks. The authors propose that the ERN is generated when a negative reinforcement learning signal is conveyed to the anterior cingulate cortex via the mesencephalic dopamine system and that this signal is used by the anterior cingulate cortex to modify performance on the task at hand. They provide support for this proposal using both computational modeling and psychophysiological experimentation.
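The proposal in the abstract (ERN amplitude tracking a negative reinforcement-learning signal) can be sketched numerically. The values, scaling, and variable names below are illustrative assumptions, not the authors' parameters.

```python
alpha = 0.2        # learning rate (assumed)
v = 0.8            # learned reward expectation for the chosen action
reward = 0.0       # error trial: the expected reward is withheld

delta = reward - v                  # negative prediction error
ern_amplitude = max(0.0, -delta)    # ERN scales with -delta (modeling assumption)
v += alpha * delta                  # expectation is revised downward

print(ern_amplitude)  # 0.8
```

On a correct trial (`reward` near `v`), `delta` is close to zero and no ERN is produced, which matches the account's claim that the component is driven by the dopaminergic training signal rather than by errors per se.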

3,438 citations