Top 21 papers published by Matthew Botvinick from University College London in 2019

Journal Article•DOI•

[...]

Matthew Botvinick¹, Samuel Ritter², Jane X. Wang, Zeb Kurth-Nelson¹, Charles Blundell, Demis Hassabis¹ - Show less +2 more•Institutions (2)

University College London¹, Princeton University²

01 May 2019-Trends in Cognitive Sciences

TL;DR: This review describes recently developed techniques that allow deep RL to operate more nimbly, solving problems much more quickly than previous methods, and proposes that they may have rich implications for psychology and neuroscience.

...read moreread less

433 citations

Posted Content•

MONet: Unsupervised Scene Decomposition and Representation

[...]

Christopher P. Burgess, Loic Matthey, Nicholas Watters, Rishabh Kabra, Irina Higgins, Matthew Botvinick, Alexander Lerchner¹ - Show less +3 more•Institutions (1)

Google¹

22 Jan 2019-arXiv: Computer Vision and Pattern Recognition

TL;DR: The Multi-Object Network (MONet) is developed, which is capable of learning to decompose and represent challenging 3D scenes into semantically meaningful components, such as objects and background elements.

...read moreread less

Abstract: The ability to decompose scenes in terms of abstract building blocks is crucial for general intelligence. Where those basic building blocks share meaningful properties, interactions and other regularities across scenes, such decompositions can simplify reasoning and facilitate imagination of novel scenarios. In particular, representing perceptual observations in terms of entities should improve data efficiency and transfer performance on a wide range of tasks. Thus we need models capable of discovering useful decompositions of scenes by identifying units with such regularities and representing them in a common format. To address this problem, we have developed the Multi-Object Network (MONet). In this model, a VAE is trained end-to-end together with a recurrent attention network -- in a purely unsupervised manner -- to provide attention masks around, and reconstructions of, regions of images. We show that this model is capable of learning to decompose and represent challenging 3D scenes into semantically meaningful components, such as objects and background elements.

...read moreread less

346 citations

Posted Content•

Multi-Object Representation Learning with Iterative Variational Inference

[...]

Klaus Greff¹, Raphael Lopez Kaufman, Rishabh Kabra, Nicholas Watters¹, Christopher P. Burgess¹, Daniel Zoran¹, Loic Matthey¹, Matthew Botvinick², Alexander Lerchner¹ - Show less +5 more•Institutions (2)

Google¹, University College London²

01 Mar 2019-arXiv: Learning

TL;DR: In this paper, the authors argue for the importance of learning to segment and represent objects jointly, and demonstrate that, starting from the simple assumption that a scene is composed of multiple entities, it is possible to learn to segment images into interpretable objects with disentangled representations.

...read moreread less

Abstract: Human perception is structured around objects which form the basis for our higher-level cognition and impressive systematic generalization abilities. Yet most work on representation learning focuses on feature learning without even considering multiple objects, or treats segmentation as an (often supervised) preprocessing step. Instead, we argue for the importance of learning to segment and represent objects jointly. We demonstrate that, starting from the simple assumption that a scene is composed of multiple entities, it is possible to learn to segment images into interpretable objects with disentangled representations. Our method learns -- without supervision -- to inpaint occluded parts, and extrapolates to scenes with more objects and to unseen objects with novel feature combinations. We also show that, due to the use of iterative variational inference, our system is able to learn multi-modal posteriors for ambiguous inputs and extends naturally to sequences.

...read moreread less

247 citations

Posted Content•

Stabilizing Transformers for Reinforcement Learning.

[...]

Emilio Parisotto¹, H. Francis Song, Jack W. Rae, Razvan Pascanu, Caglar Gulcehre, Siddhant M. Jayakumar, Max Jaderberg, Raphael Lopez Kaufman, Aidan Clark, Seb Noury, Matthew Botvinick, Nicolas Heess, Raia Hadsell - Show less +9 more•Institutions (1)

Carnegie Mellon University¹

13 Oct 2019-arXiv: Learning

TL;DR: The proposed architecture, the Gated Transformer-XL (GTrXL), surpasses LSTMs on challenging memory environments and achieves state-of-the-art results on the multi-task DMLab-30 benchmark suite, exceeding the performance of an external memory architecture.

...read moreread less

Abstract: Owing to their ability to both effectively integrate information over long time horizons and scale to massive amounts of data, self-attention architectures have recently shown breakthrough success in natural language processing (NLP), achieving state-of-the-art results in domains such as language modeling and machine translation. Harnessing the transformer's ability to process long time horizons of information could provide a similar performance boost in partially observable reinforcement learning (RL) domains, but the large-scale transformers used in NLP have yet to be successfully applied to the RL setting. In this work we demonstrate that the standard transformer architecture is difficult to optimize, which was previously observed in the supervised learning setting but becomes especially pronounced with RL objectives. We propose architectural modifications that substantially improve the stability and learning speed of the original Transformer and XL variant. The proposed architecture, the Gated Transformer-XL (GTrXL), surpasses LSTMs on challenging memory environments and achieves state-of-the-art results on the multi-task DMLab-30 benchmark suite, exceeding the performance of an external memory architecture. We show that the GTrXL, trained using the same losses, has stability and performance that consistently matches or exceeds a competitive LSTM baseline, including on more reactive tasks where memory is less critical. GTrXL offers an easy-to-train, simple-to-implement but substantially more expressive architectural alternative to the standard multi-layer LSTM ubiquitously used for RL agents in partially observable environments.

...read moreread less

159 citations

Journal Article•DOI•

Hierarchical motor control in mammals and machines.

[...]

Josh Merel, Matthew Botvinick, Greg Wayne

02 Dec 2019-Nature Communications

TL;DR: The authors argue for a return to hierarchical models of motor control in neuroscience, and relate them to hierarchy in the nervous system, and highlight research themes that they anticipate will be critical in solving challenges at this disciplinary intersection.

...read moreread less

Abstract: Advances in artificial intelligence are stimulating interest in neuroscience. However, most attention is given to discrete tasks with simple action spaces, such as board games and classic video games. Less discussed in neuroscience are parallel advances in "synthetic motor control". While motor neuroscience has recently focused on optimization of single, simple movements, AI has progressed to the generation of rich, diverse motor behaviors across multiple tasks, at humanoid scale. It is becoming clear that specific, well-motivated hierarchical design elements repeatedly arise when engineering these flexible control systems. We review these core principles of hierarchical control, relate them to hierarchy in the nervous system, and highlight research themes that we anticipate will be critical in solving challenges at this disciplinary intersection.

...read moreread less

146 citations

Proceedings Article•

Multi-Object Representation Learning with Iterative Variational Inference

[...]

Klaus Greff¹, Raphael Lopez Kaufman, Rishabh Kabra, Nicholas Watters¹, Christopher P. Burgess¹, Daniel Zoran¹, Loic Matthey¹, Matthew Botvinick², Alexander Lerchner¹ - Show less +5 more•Institutions (2)

Google¹, University College London²

24 May 2019

TL;DR: In this article, the authors argue for the importance of learning to segment and represent objects jointly, and demonstrate that, starting from the simple assumption that a scene is composed of multiple entities, it is possible to learn to segment images into interpretable objects with disentangled representations.

...read moreread less

Abstract: Human perception is structured around objects which form the basis for our higher-level cognition and impressive systematic generalization abilities. Yet most work on representation learning focuses on feature learning without even considering multiple objects, or treats segmentation as an (often supervised) preprocessing step. Instead, we argue for the importance of learning to segment and represent objects jointly. We demonstrate that, starting from the simple assumption that a scene is composed of multiple entities, it is possible to learn to segment images into interpretable objects with disentangled representations. Our method learns -- without supervision -- to inpaint occluded parts, and extrapolates to scenes with more objects and to unseen objects with novel feature combinations. We also show that, due to the use of iterative variational inference, our system is able to learn multi-modal posteriors for ambiguous inputs and extends naturally to sequences.

...read moreread less

85 citations

Posted Content•

Causal Reasoning from Meta-reinforcement learning

[...]

Ishita Dasgupta¹, Jane X. Wang, Silvia Chiappa, Jovana Mitrovic, Pedro A. Ortega, David Raposo, Edward Hughes, Peter W. Battaglia, Matthew Botvinick, Zeb Kurth-Nelson - Show less +6 more•Institutions (1)

Harvard University¹

23 Jan 2019-arXiv: Learning

TL;DR: It is suggested that causal reasoning in complex settings may benefit from the more end-to-end learning-based approaches presented here, and this work offers new strategies for structured exploration in reinforcement learning, by providing agents with the ability to perform -- and interpret -- experiments.

...read moreread less

Abstract: Discovering and exploiting the causal structure in the environment is a crucial challenge for intelligent agents. Here we explore whether causal reasoning can emerge via meta-reinforcement learning. We train a recurrent network with model-free reinforcement learning to solve a range of problems that each contain causal structure. We find that the trained agent can perform causal reasoning in novel situations in order to obtain rewards. The agent can select informative interventions, draw causal inferences from observational data, and make counterfactual predictions. Although established formal causal reasoning algorithms also exist, in this paper we show that such reasoning can arise from model-free reinforcement learning, and suggest that causal reasoning in complex settings may benefit from the more end-to-end learning-based approaches presented here. This work also offers new strategies for structured exploration in reinforcement learning, by providing agents with the ability to perform -- and interpret -- experiments.

...read moreread less

80 citations

Journal Article•DOI•

Widespread temporal coding of cognitive control in the human prefrontal cortex.

[...]

Elliot H. Smith¹, Guillermo Horga², Mark Yates², Charles B. Mikell³, Garrett P. Banks², Yagna Pathak², Catherine A. Schevon², Guy M. McKhann², Benjamin Y. Hayden, Matthew Botvinick, Sameer A. Sheth⁴ - Show less +7 more•Institutions (4)

University of Utah¹, Columbia University², Stony Brook University³, Baylor College of Medicine⁴

01 Nov 2019-Nature Neuroscience

TL;DR: The hypothesis that a cross-areal rhythmic neuronal coordination is intrinsic to cognitive control in response to conflict is supported, and new evidence is provided to support the hypothesis that conflict processing involves modulation of the dlPFC by the dACC.

...read moreread less

Abstract: When making decisions we often face the need to adjudicate between conflicting strategies or courses of action. Our ability to understand the neuronal processes underlying conflict processing is limited on the one hand by the spatiotemporal resolution of functional MRI and, on the other hand, by imperfect cross-species homologies in animal model systems. Here we examine the responses of single neurons and local field potentials in human neurosurgical patients in two prefrontal regions critical to controlled decision-making, the dorsal anterior cingulate cortex (dACC) and dorsolateral prefrontal cortex (dlPFC). While we observe typical modest conflict-related firing rate effects, we find a widespread effect of conflict on spike-phase coupling in the dACC and on driving spike-field coherence in the dlPFC. These results support the hypothesis that a cross-areal rhythmic neuronal coordination is intrinsic to cognitive control in response to conflict, and provide new evidence to support the hypothesis that conflict processing involves modulation of the dlPFC by the dACC.

...read moreread less

76 citations

Proceedings Article•

Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning

[...]

Jakob Foerster¹, H. Francis Song², Edward Hughes³, Neil Burch⁴, Iain Dunning³, Shimon Whiteson⁵, Matthew Botvinick⁶, Michael Bowling⁴ - Show less +4 more•Institutions (6)

Facebook¹, Center for Neural Science², Google³, University of Alberta⁴, University of Oxford⁵, University College London⁶

24 May 2019

TL;DR: The Bayesian action decoder (BAD), a new multi-agent learning method that uses an approximate Bayesian update to obtain a public belief that conditions on the actions taken by all agents in the environment are met, is presented.

...read moreread less

Abstract: When observing the actions of others, humans make inferences about why they acted as they did, and what this implies about the world; humans also use the fact that their actions will be interpreted in this manner, allowing them to act informatively and thereby communicate efficiently with others. Although learning algorithms have recently achieved superhuman performance in a number of two-player, zero-sum games, scalable multi-agent reinforcement learning algorithms that can discover effective strategies and conventions in complex, partially observable settings have proven elusive. We present the Bayesian action decoder (BAD), a new multi-agent learning method that uses an approximate Bayesian update to obtain a public belief that conditions on the actions taken by all agents in the environment. BAD introduces a new Markov decision process, the public belief MDP, in which the action space consists of all deterministic partial policies, and exploits the fact that an agent acting only on this public belief state can still learn to use its private information if the action space is augmented to be over all partial policies mapping private information into environment actions. The Bayesian update is closely related to the theory of mind reasoning that humans carry out when observing others' actions. We first validate BAD on a proof-of-principle two-step matrix game, where it outperforms policy gradient methods; we then evaluate BAD on the challenging, cooperative partial-information card game Hanabi, where, in the two-player setting, it surpasses all previously published learning and hand-coded approaches, establishing a new state of the art.

...read moreread less

71 citations

Posted Content•

Meta-learning of Sequential Strategies.

[...]

08 May 2019-arXiv: Learning

TL;DR: This report recast memory-based meta-learning within a Bayesian framework, showing that the meta-learned strategies are near-optimal because they amortize Bayes-filtered data, where the adaptation is implemented in the memory dynamics as a state-machine of sufficient statistics.

...read moreread less

Abstract: In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundations of this tool for building new, scalable agents that operate on broad domains. To do so, we present basic algorithmic templates for building near-optimal predictors and reinforcement learners which behave as if they had a probabilistic model that allowed them to efficiently exploit task structure. Furthermore, we recast memory-based meta-learning within a Bayesian framework, showing that the meta-learned strategies are near-optimal because they amortize Bayes-filtered data, where the adaptation is implemented in the memory dynamics as a state-machine of sufficient statistics. Essentially, memory-based meta-learning translates the hard problem of probabilistic sequential inference into a regression problem.

...read moreread less

66 citations

Proceedings Article•

InfoBot: Transfer and Exploration via the Information Bottleneck

[...]

Anirudh Goyal¹, Riashat Islam², DJ Strouse³, Zafarali Ahmed², Hugo Larochelle⁴, Matthew Botvinick⁵, Yoshua Bengio¹, Sergey Levine⁶ - Show less +4 more•Institutions (6)

Université de Montréal¹, McGill University², Princeton University³, Google⁴, University College London⁵, University of California, Berkeley⁶

01 Jan 2019

TL;DR: The authors propose to learn about decision states from prior experience by training a goal-conditioned policy with an information bottleneck, which can identify decision states by examining where the model actually leverages the goal state.

...read moreread less

Abstract: A central challenge in reinforcement learning is discovering effective policies for tasks where rewards are sparsely distributed. We postulate that in the absence of useful reward signals, an effective exploration strategy should seek out {\it decision states}. These states lie at critical junctions in the state space from where the agent can transition to new, potentially unexplored regions. We propose to learn about decision states from prior experience. By training a goal-conditioned policy with an information bottleneck, we can identify decision states by examining where the model actually leverages the goal state. We find that this simple mechanism effectively identifies decision states, even in partially observed settings. In effect, the model learns the sensory cues that correlate with potential subgoals. In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.

...read moreread less

Posted Content•

Environmental drivers of systematicity and generalization in a situated agent

[...]

Felix Hill¹, Andrew K. Lampinen², Rosalia Schneider¹, Stephen Clark¹, Matthew Botvinick³, James L. McClelland², Adam Santoro¹ - Show less +3 more•Institutions (3)

Google¹, Stanford University², University College London³

01 Oct 2019-arXiv: Artificial Intelligence

TL;DR: It is indicated that the degree of generalisation that networks exhibit can depend critically on particulars of the environment in which a given task is instantiated, and that the propensity for neural networks to generalise in systematic ways may increase if those networks have access to many frames of richly varying, multi-modal observations as they learn.

...read moreread less

Abstract: The question of whether deep neural networks are good at generalising beyond their immediate training experience is of critical importance for learning-based approaches to AI. Here, we consider tests of out-of-sample generalisation that require an agent to respond to never-seen-before instructions by manipulating and positioning objects in a 3D Unity simulated room. We first describe a comparatively generic agent architecture that exhibits strong performance on these tests. We then identify three aspects of the training regime and environment that make a significant difference to its performance: (a) the number of object/word experiences in the training set; (b) the visual invariances afforded by the agent's perspective, or frame of reference; and (c) the variety of visual input inherent in the perceptual aspect of the agent's perception. Our findings indicate that the degree of generalisation that networks exhibit can depend critically on particulars of the environment in which a given task is instantiated. They further suggest that the propensity for neural networks to generalise in systematic ways may increase if, like human children, those networks have access to many frames of richly varying, multi-modal observations as they learn.

...read moreread less

Journal Article•DOI•

Subgoal- and Goal-related Reward Prediction Errors in Medial Prefrontal Cortex.

[...]

Jose J. F. Ribas Fernandes¹, Danesh Shahnazian¹, Clay B. Holroyd¹, Matthew Botvinick²•Institutions (2)

University of Victoria¹, University College London²

01 Jan 2019-Journal of Cognitive Neuroscience

TL;DR: Using fMRI of human participants engaged in a hierarchical navigation task, it is found that mPFC also processes positive prediction errors at the level of subgoals, indicating that this brain region is sensitive to advances in subgoal completion.

...read moreread less

Abstract: A longstanding view of the organization of human and animal behavior holds that behavior is hierarchically organized-in other words, directed toward achieving superordinate goals through the achievement of subordinate goals or subgoals. However, most research in neuroscience has focused on tasks without hierarchical structure. In past work, we have shown that negative reward prediction error (RPE) signals in medial prefrontal cortex (mPFC) can be linked not only to superordinate goals but also to subgoals. This suggests that mPFC tracks impediments in the progression toward subgoals. Using fMRI of human participants engaged in a hierarchical navigation task, here we found that mPFC also processes positive prediction errors at the level of subgoals, indicating that this brain region is sensitive to advances in subgoal completion. However, when subgoal RPEs were elicited alongside with goal-related RPEs, mPFC responses reflected only the goal-related RPEs. These findings suggest that information from different levels of hierarchy is processed selectively, depending on the task context.

...read moreread less

Posted Content•

V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control

[...]

H. Francis Song¹, Abbas Abdolmaleki¹, Jost Tobias Springenberg¹, Aidan Clark¹, Hubert Soyer², Jack W. Rae¹, Seb Noury³, Arun Ahuja¹, Siqi Liu¹, Dhruva Tirumala¹, Nicolas Heess¹, Dan Belov¹, Martin Riedmiller¹, Matthew Botvinick⁴ - Show less +10 more•Institutions (4)

Google¹, National Institute of Informatics², Palantir Technologies³, University College London⁴

26 Sep 2019-arXiv: Artificial Intelligence

TL;DR: V-MPO is introduced, an on-policy adaptation of Maximum a Posteriori Policy Optimization that performs policy iteration based on a learned state-value function and does so reliably without importance weighting, entropy regularization, or population-based tuning of hyperparameters.

...read moreread less

Abstract: Some of the most successful applications of deep reinforcement learning to challenging domains in discrete and continuous control have used policy gradient methods in the on-policy setting. However, policy gradients can suffer from large variance that may limit performance, and in practice require carefully tuned entropy regularization to prevent policy collapse. As an alternative to policy gradient algorithms, we introduce V-MPO, an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO) that performs policy iteration based on a learned state-value function. We show that V-MPO surpasses previously reported scores for both the Atari-57 and DMLab-30 benchmark suites in the multi-task setting, and does so reliably without importance weighting, entropy regularization, or population-based tuning of hyperparameters. On individual DMLab and Atari levels, the proposed algorithm can achieve scores that are substantially higher than has previously been reported. V-MPO is also applicable to problems with high-dimensional, continuous action spaces, which we demonstrate in the context of learning to control simulated humanoids with 22 degrees of freedom from full state observations and 56 degrees of freedom from pixel observations, as well as example OpenAI Gym tasks where V-MPO achieves substantially higher asymptotic scores than previously reported.

...read moreread less

Posted Content•

InfoBot: Transfer and Exploration via the Information Bottleneck

[...]

Anirudh Goyal, Riashat Islam, DJ Strouse, Zafarali Ahmed, Matthew Botvinick, Hugo Larochelle, Yoshua Bengio, Sergey Levine - Show less +4 more

30 Jan 2019-arXiv: Machine Learning

TL;DR: The authors propose to learn about decision states from prior experience by training a goal-conditioned policy with an information bottleneck, which can identify decision states by examining where the model actually leverages the goal state.

...read moreread less

Abstract: A central challenge in reinforcement learning is discovering effective policies for tasks where rewards are sparsely distributed. We postulate that in the absence of useful reward signals, an effective exploration strategy should seek out {\it decision states}. These states lie at critical junctions in the state space from where the agent can transition to new, potentially unexplored regions. We propose to learn about decision states from prior experience. By training a goal-conditioned policy with an information bottleneck, we can identify decision states by examining where the model actually leverages the goal state. We find that this simple mechanism effectively identifies decision states, even in partially observed settings. In effect, the model learns the sensory cues that correlate with potential subgoals. In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.

...read moreread less

Posted Content•

Emergent Systematic Generalization in a Situated Agent

[...]

Felix Hill¹, Andrew K. Lampinen², Rosalia Schneider¹, Stephen Clark¹, Matthew Botvinick³, James L. McClelland², Adam Santoro¹ - Show less +3 more•Institutions (3)

Google¹, Stanford University², University College London³

01 Oct 2019

TL;DR: In this paper, the authors consider tests of out-of-sample generalization that require an agent to respond to never-seen-before instructions by manipulating and positioning objects in a 3D Unity simulated room.

...read moreread less

Abstract: The question of whether deep neural networks are good at generalising beyond their immediate training experience is of critical importance for learning-based approaches to AI. Here, we consider tests of out-of-sample generalisation that require an agent to respond to never-seen-before instructions by manipulating and positioning objects in a 3D Unity simulated room. We first describe a comparatively generic agent architecture that exhibits strong performance on these tests. We then identify three aspects of the training regime and environment that make a significant difference to its performance: (a) the number of object/word experiences in the training set; (b) the visual invariances afforded by the agent's perspective, or frame of reference; and (c) the variety of visual input inherent in the perceptual aspect of the agent's perception. Our findings indicate that the degree of generalisation that networks exhibit can depend critically on particulars of the environment in which a given task is instantiated. They further suggest that the propensity for neural networks to generalise in systematic ways may increase if, like human children, those networks have access to many frames of richly varying, multi-modal observations as they learn.

...read moreread less

Posted Content•

Learned human-agent decision-making, communication and joint action in a virtual reality environment.

[...]

Patrick M. Pilarski, Andrew Butcher, Michael Johanson, Matthew Botvinick, Andrew Bolt, Adam S. R. Parker - Show less +2 more

07 May 2019-arXiv: Artificial Intelligence

TL;DR: A virtual reality environment wherein a human and an agent can adapt their predictions, their actions, and their communication so as to pursue a simple foraging task is contributed and comparisons suggest the utility of studying human-machine coordination in avirtual reality environment, and identify further research that will expand the understanding of persistent human- machine joint action.

...read moreread less

Abstract: Humans make decisions and act alongside other humans to pursue both short-term and long-term goals. As a result of ongoing progress in areas such as computing science and automation, humans now also interact with non-human agents of varying complexity as part of their day-to-day activities; substantial work is being done to integrate increasingly intelligent machine agents into human work and play. With increases in the cognitive, sensory, and motor capacity of these agents, intelligent machinery for human assistance can now reasonably be considered to engage in joint action with humans---i.e., two or more agents adapting their behaviour and their understanding of each other so as to progress in shared objectives or goals. The mechanisms, conditions, and opportunities for skillful joint action in human-machine partnerships is of great interest to multiple communities. Despite this, human-machine joint action is as yet under-explored, especially in cases where a human and an intelligent machine interact in a persistent way during the course of real-time, daily-life experience. In this work, we contribute a virtual reality environment wherein a human and an agent can adapt their predictions, their actions, and their communication so as to pursue a simple foraging task. In a case study with a single participant, we provide an example of human-agent coordination and decision-making involving prediction learning on the part of the human and the machine agent, and control learning on the part of the machine agent wherein audio communication signals are used to cue its human partner in service of acquiring shared reward. These comparisons suggest the utility of studying human-machine coordination in a virtual reality environment, and identify further research that will expand our understanding of persistent human-machine joint action.

...read moreread less

Journal Article•DOI•

Is Coding a Relevant Metaphor for Building AI

[...]

Adam Santoro¹, Felix Hill¹, David G. T. Barrett¹, David Raposo¹, Matthew Botvinick¹, Timothy P. Lillicrap¹ - Show less +2 more•Institutions (1)

Google¹

28 Nov 2019-Behavioral and Brain Sciences

TL;DR: This paper argued that neural coding metaphor is an insufficient guide for building an artificial intelligence that learns to accomplish short-and long-term goals in a complex, changing environment, and argued that it is not an appropriate metaphor for neural networks.

...read moreread less

Abstract: Brette contends that the neural coding metaphor is an invalid basis for theories of what the brain does. Here, we argue that it is an insufficient guide for building an artificial intelligence that learns to accomplish short- and long-term goals in a complex, changing environment.

...read moreread less

InfoBot: Structured Exploration in ReinforcementLearning Using Information Bottleneck

[...]

Anirudh Goyal, Riashat Islam, DJ Strouse, Zafarali Ahmed, Matthew Botvinick, Hugo Larochelle, Yoshua Bengio, Sergey Levine - Show less +4 more

01 Jan 2019

Journal Article•DOI•

Editorial overview: Artificial intelligence

[...]

Matthew Botvinick, Samuel J. Gershman¹•Institutions (1)

Harvard University¹

13 Aug 2019-Current opinion in behavioral sciences

Posted Content•

Is coding a relevant metaphor for building AI? A commentary on "Is coding a relevant metaphor for the brain?", by Romain Brette

[...]

Adam Santoro, Felix Hill, David G. T. Barrett, David Raposo, Matthew Botvinick, Timothy P. Lillicrap - Show less +2 more

18 Apr 2019-arXiv: Neurons and Cognition

TL;DR: The neural coding metaphor is an invalid basis for theories of what the brain does and is an insufficient guide for building an artificial intelligence that learns to accomplish short- and long-term goals in a complex, changing environment.

...read moreread less

Abstract: Brette contends that the neural coding metaphor is an invalid basis for theories of what the brain does. Here, we argue that it is an insufficient guide for building an artificial intelligence that learns to accomplish short- and long-term goals in a complex, changing environment.

...read moreread less

Showing papers by "Matthew Botvinick published in 2019"