Opening the black box: Low-dimensional dynamics in high-dimensional recurrent neural networks

doi:10.1162/NECO_A_00409

Journal Article•DOI•

Opening the black box: Low-dimensional dynamics in high-dimensional recurrent neural networks

David Sussillo¹, Omri Barak²•Institutions (2)

Stanford University¹, Columbia University²

01 Mar 2013-Neural Computation (MIT Press)-Vol. 25, Iss: 3, pp 626-649

TL;DR: The hypothesis that fixed points, both stable and unstable, and the linearized dynamics around them, can reveal crucial aspects of how RNNs implement their computations is explored.

Abstract: Recurrent neural networks (RNNs) are useful tools for learning nonlinear relationships between time-varying inputs and outputs with complex temporal dependencies. Recently developed algorithms have been successful at training RNNs to perform a wide variety of tasks, but the resulting networks have been treated as black boxes: their mechanism of operation remains unknown. Here we explore the hypothesis that fixed points, both stable and unstable, and the linearized dynamics around them, can reveal crucial aspects of how RNNs implement their computations. Further, we explore the utility of linearization in areas of phase space that are not true fixed points but merely points of very slow movement. We present a simple optimization technique that is applied to trained RNNs to find the fixed and slow points of their dynamics. Linearization around these slow regions can be used to explore, or reverse-engineer, the behavior of the RNN. We describe the technique, illustrate it using simple examples, and finally showcase it on three high-dimensional RNN examples: a 3-bit flip-flop device, an input-dependent sine wave generator, and a two-point moving average. In all cases, the mechanisms of trained networks could be inferred from the sets of fixed and slow points and the linearized dynamics around them.
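
As a rough illustration of what searching for fixed and slow points by optimization can look like, the sketch below minimizes a scalar "speed" q(x) = 0.5 |F(x)|^2 by plain gradient descent and then inspects the eigenvalues of the linearization at the result. Everything specific here is an assumption made for the example and is not taken from this page: the continuous-time dynamics dx/dt = F(x) = -x + J tanh(x), the toy 2x2 weight matrix `J`, the function names, and the step size and iteration count.

```python
import numpy as np

def velocity(x, J):
    """Right-hand side F(x) of the assumed dynamics dx/dt = -x + J tanh(x)."""
    return -x + J @ np.tanh(x)

def jacobian(x, J):
    """Linearization dF/dx = -I + J diag(1 - tanh(x)^2)."""
    return -np.eye(len(x)) + J * (1.0 - np.tanh(x) ** 2)

def speed(x, J):
    """Scalar objective q(x) = 0.5 * |F(x)|^2, which is zero exactly at fixed points."""
    f = velocity(x, J)
    return 0.5 * f @ f

def find_slow_point(x0, J, lr=0.5, steps=2000):
    """Plain gradient descent on q, using grad q = (dF/dx)^T F."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x -= lr * jacobian(x, J).T @ velocity(x, J)
    return x

# Toy, hand-picked weights (hypothetical, not a trained network).
J = 2.0 * np.eye(2)

x_unstable = find_slow_point([0.3, -0.2], J)   # start near the origin
x_stable = find_slow_point([1.5, -1.2], J)     # start farther out

for name, x in [("start near origin", x_unstable), ("start farther out", x_stable)]:
    max_real = np.linalg.eigvals(jacobian(x, J)).real.max()
    kind = "unstable" if max_real > 0 else "stable"
    print(name, x, "q =", speed(x, J), kind)
```

Because the search minimizes speed rather than simulating the dynamics forward, it can land on unstable fixed points as well as stable ones; the sign of the largest real part among the Jacobian eigenvalues then separates the two cases. In a larger network, points where q is small but not zero would be the candidates for slow points.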

Citations

Journal Article•DOI•

Context-dependent computation by recurrent dynamics in prefrontal cortex

Valerio Mante¹, David Sussillo², Krishna V. Shenoy², William T. Newsome¹•Institutions (2)

Howard Hughes Medical Institute¹, Stanford University²

07 Nov 2013-Nature

TL;DR: This work studies prefrontal cortex activity in macaque monkeys trained to flexibly select and integrate noisy sensory inputs towards a choice, and finds that the observed complexity and functional roles of single neurons are readily understood in the framework of a dynamical process unfolding at the level of the population.

Abstract: Prefrontal cortex is thought to have a fundamental role in flexible, context-dependent behaviour, but the exact nature of the computations underlying this role remains largely unknown. In particular, individual prefrontal neurons often generate remarkably complex responses that defy deep understanding of their contribution to behaviour. Here we study prefrontal cortex activity in macaque monkeys trained to flexibly select and integrate noisy sensory inputs towards a choice. We find that the observed complexity and functional roles of single neurons are readily understood in the framework of a dynamical process unfolding at the level of the population. The population dynamics can be reproduced by a trained recurrent neural network, which suggests a previously unknown mechanism for selection and integration of task-relevant inputs. This mechanism indicates that selection and integration are two aspects of a single dynamical process unfolding within the same prefrontal circuits, and potentially provides a novel, general framework for understanding context-dependent computations.

1,416 citations

Book•

Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition

Wulfram Gerstner¹, Werner M. Kistler, Richard Naud², Liam Paninski³•Institutions (3)

École Polytechnique Fédérale de Lausanne¹, University of Ottawa², Columbia University³

22 Sep 2014

TL;DR: This textbook for advanced undergraduate and beginning graduate students provides a thorough and up-to-date introduction to the fields of computational and theoretical neuroscience.

Abstract: What happens in our brain when we make a decision? What triggers a neuron to send out a signal? What is the neural code? This textbook for advanced undergraduate and beginning graduate students provides a thorough and up-to-date introduction to the fields of computational and theoretical neuroscience. It covers classical topics, including the Hodgkin-Huxley equations and Hopfield model, as well as modern developments in the field such as Generalized Linear Models and decision theory. Concepts are introduced using clear step-by-step explanations suitable for readers with only a basic knowledge of differential equations and probabilities, and are richly illustrated by figures and worked-out examples. End-of-chapter summaries and classroom-tested exercises make the book ideal for courses or for self-study. The authors also give pointers to the literature and an extensive bibliography, which will prove invaluable to readers interested in further study.

942 citations

Journal Article•DOI•

The practical implementation of artificial intelligence technologies in medicine.

Jianxing He¹, Sally L. Baxter²,³, Jie Xu⁴, Jiming Xu, Xingtao Zhou⁵, Kang Zhang•Institutions (5)

Guangzhou Medical University¹, University of California, San Diego², Veterans Health Administration³, Capital Medical University⁴, Fudan University⁵

07 Jan 2019-Nature Medicine

TL;DR: The current regulatory environment in the United States is summarized and comparisons are highlighted with other regions in the world, notably Europe and China, to bring the full potential of AI to the clinic.

Abstract: The development of artificial intelligence (AI)-based technologies in medicine is advancing rapidly, but real-world clinical implementation has not yet become a reality. Here we review some of the key practical issues surrounding the implementation of AI into existing clinical workflows, including data sharing and privacy, transparency of algorithms, data standardization, and interoperability across multiple platforms, and concern for patient safety. We summarize the current regulatory environment in the United States and highlight comparisons with other regions in the world, notably Europe and China.

904 citations

Journal Article•DOI•

Artificial Intelligence in Surgery: Promises and Perils

Daniel A. Hashimoto¹, Guy Rosman², Daniela Rus², Ozanan R. Meireles¹•Institutions (2)

Harvard University¹, Massachusetts Institute of Technology²

01 Jul 2018-Annals of Surgery

TL;DR: Surgeons are well positioned to help integrate AI into modern practice and should partner with data scientists to capture data across phases of care and to provide clinical context, for AI has the potential to revolutionize the way surgery is taught and practiced.

Abstract: Objective: The aim of this review was to summarize major topics in artificial intelligence (AI), including their applications and limitations in surgery. This paper reviews the key capabilities of AI to help surgeons understand and critically evaluate new AI applications and to contribute to new deve

515 citations

Cites background from "Opening the black box: Low-dimensio..."

...An important concern regarding AI algorithms involves their interpretability,(51) for techniques such as neural networks are based on a “black box” design.(52) Although the automated nature of neural networks allows for detection of patterns missed by humans, human scientists are left with little ability to assess how or why such patterns were discerned by the computer....

Journal Article•DOI•

Cocreation of value in a platform ecosystem: the case of enterprise software

Marco Ceccagnoli¹, Chris Forman¹, Peng Huang², D. J. Wu¹•Institutions (2)

Georgia Institute of Technology¹, University of Maryland, College Park²

01 Mar 2012-Management Information Systems Quarterly

TL;DR: In this article, the authors examine whether participation in an ecosystem partnership improves the business performance of small independent software vendors (ISVs) in the enterprise software industry and how appropriability mechanisms influence the benefits of partnership.

Abstract: It has been argued that platform technology owners cocreate business value with other firms in their platform ecosystems by encouraging complementary invention and exploiting indirect network effects. In this study, we examine whether participation in an ecosystem partnership improves the business performance of small independent software vendors (ISVs) in the enterprise software industry and how appropriability mechanisms influence the benefits of partnership. By analyzing the partnering activities and performance indicators of a sample of 1,210 small ISVs over the period 1996-2004, we find that joining a major platform owner's platform ecosystem is associated with an increase in sales and a greater likelihood of issuing an initial public offering (IPO). In addition, we show that these impacts are greater when ISVs have greater intellectual property rights or stronger downstream capabilities. This research highlights the value of interoperability between software products, and stresses that value cocreation and appropriation are not mutually exclusive strategies in interfirm collaboration.

484 citations

References

Book•

Convex Optimization

Stephen Boyd¹, Lieven Vandenberghe²•Institutions (2)

Stanford University¹, University of California, Los Angeles²

01 Mar 2004

TL;DR: A comprehensive introduction to convex optimization that shows how such problems can be solved numerically with great efficiency, with a focus on recognizing convex optimization problems and then finding the most appropriate technique for solving them.

Abstract: Convex optimization problems arise frequently in many different fields. A comprehensive introduction to the subject, this book shows in detail how such problems can be solved numerically with great efficiency. The focus is on recognizing convex optimization problems and then finding the most appropriate technique for solving them. The text contains many worked examples and homework exercises and will appeal to students, researchers and practitioners in fields such as engineering, computer science, mathematics, statistics, finance, and economics.

33,341 citations

Book Chapter•DOI•

Learning internal representations by error propagation

David E. Rumelhart, Geoffrey E. Hinton, Ronald J. Williams

01 Jan 1988

TL;DR: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.

Abstract: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion

17,604 citations

Journal Article•DOI•

Neural networks and physical systems with emergent collective computational abilities

John J. Hopfield

01 Apr 1982-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: A model of a system having a large number of simple equivalent components, based on aspects of neurobiology but readily adapted to integrated circuits, produces a content-addressable memory which correctly yields an entire memory from any subpart of sufficient size.

Abstract: Computational properties of use to biological organisms or to the construction of computers can emerge as collective properties of systems having a large number of simple equivalent components (or neurons). The physical meaning of content-addressable memory is described by an appropriate phase space flow of the state of a system. A model of such a system is given, based on aspects of neurobiology but readily adapted to integrated circuits. The collective properties of this model produce a content-addressable memory which correctly yields an entire memory from any subpart of sufficient size. The algorithm for the time evolution of the state of the system is based on asynchronous parallel processing. Additional emergent collective properties include some capacity for generalization, familiarity recognition, categorization, error correction, and time sequence retention. The collective properties are only weakly sensitive to details of the modeling or the failure of individual devices.
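
The content-addressable memory described in this abstract can be illustrated with a very small sketch. The details below are assumptions chosen for brevity rather than the paper's exact formulation: states are coded as +1/-1, weights come from a Hebbian outer-product rule with the diagonal zeroed, and units are updated one at a time in a fixed order instead of a random one. The names `store` and `recall` and the two 8-unit patterns are hypothetical.

```python
import numpy as np

def store(patterns):
    """Hebbian outer-product weights with no self-connections."""
    P = np.asarray(patterns, dtype=float)
    W = P.T @ P
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, probe, sweeps=5):
    """Asynchronous recall: update one unit at a time, in a fixed order."""
    s = np.array(probe, dtype=float)
    for _ in range(sweeps):
        for i in range(len(s)):
            # Each unit takes the sign of its summed input from the others.
            s[i] = 1.0 if W[i] @ s >= 0 else -1.0
    return s

# Two orthogonal 8-unit patterns (toy data for illustration).
patterns = [[1, 1, 1, 1, -1, -1, -1, -1],
            [1, -1, 1, -1, 1, -1, 1, -1]]
W = store(patterns)

probe = np.array(patterns[0], dtype=float)
probe[0] = -probe[0]                  # corrupt one unit of the first memory
recalled = recall(W, probe)
print(np.array_equal(recalled, patterns[0]))
```

In the language of the abstract, each stored pattern is a stable state of the phase space flow, and the corrupted probe plays the role of a "subpart" from which the whole memory is recovered.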

16,652 citations

"Opening the black box: Low-dimensio..." refers result in this paper

...This is in contrast to network models that are explicitly constructed to implement a specific known mechanism (see Wang, 2008; Hopfield, 1982)....

Book•

Learning internal representations by error propagation

David E. Rumelhart, Geoffrey E. Hinton, Ronald J. Williams

03 Jan 1986

TL;DR: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.

Abstract: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion

13,579 citations

Journal Article•DOI•

Learning long-term dependencies with gradient descent is difficult

Yoshua Bengio¹, Patrice Y. Simard², Paolo Frasconi³•Institutions (3)

Université de Montréal¹, AT&T², University of Florence³

01 Mar 1994-IEEE Transactions on Neural Networks

TL;DR: This work shows why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases, and exposes a trade-off between efficient learning by gradient descent and latching on information for long periods.

Abstract: Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production or prediction problems. However, practical difficulties have been reported in training recurrent neural networks to perform tasks in which the temporal contingencies present in the input/output sequences span long intervals. We show why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases. These results expose a trade-off between efficient learning by gradient descent and latching on information for long periods. Based on an understanding of this problem, alternatives to standard gradient descent are considered.

7,309 citations