Author

H. Francis Song

Bio: H. Francis Song is an academic researcher from Google. The author has contributed to research on topics including reinforcement learning and quantum entanglement, has an h-index of 22, and has co-authored 36 publications receiving 3,400 citations. Previous affiliations of H. Francis Song include New York University and the Center for Neural Science.

Papers
Posted Content
TL;DR: It is argued that combinatorial generalization must be a top priority for AI to achieve human-like abilities, and that structured representations and computations are key to realizing this objective.
Abstract: Artificial intelligence (AI) has undergone a renaissance recently, making major progress in key domains such as vision, language, control, and decision-making. This has been due, in part, to cheap data and cheap compute resources, which have fit the natural strengths of deep learning. However, many defining characteristics of human intelligence, which developed under much different pressures, remain out of reach for current approaches. In particular, generalizing beyond one's experiences--a hallmark of human intelligence from infancy--remains a formidable challenge for modern AI. The following is part position paper, part review, and part unification. We argue that combinatorial generalization must be a top priority for AI to achieve human-like abilities, and that structured representations and computations are key to realizing this objective. Just as biology uses nature and nurture cooperatively, we reject the false choice between "hand-engineering" and "end-to-end" learning, and instead advocate for an approach which benefits from their complementary strengths. We explore how using relational inductive biases within deep learning architectures can facilitate learning about entities, relations, and rules for composing them. We present a new building block for the AI toolkit with a strong relational inductive bias--the graph network--which generalizes and extends various approaches for neural networks that operate on graphs, and provides a straightforward interface for manipulating structured knowledge and producing structured behaviors. We discuss how graph networks can support relational reasoning and combinatorial generalization, laying the foundation for more sophisticated, interpretable, and flexible patterns of reasoning. As a companion to this paper, we have released an open-source software library for building graph networks, with demonstrations of how to use them in practice.
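The graph network block described above is easiest to see as three staged updates: per-edge, per-node, and global. The NumPy sketch below is a minimal illustration of that message-passing structure, with trivial additive aggregations standing in for the learned update functions of the paper; it assumes node, edge, and global features share one dimension, and it is not the authors' released graph_nets implementation.

```python
# Minimal sketch of one graph network (GN) block, assuming node, edge, and
# global features all share one dimension d. The additive "updates" are
# trivial stand-ins for the learned functions of a real GN.
import numpy as np

def gn_block(nodes, edges, senders, receivers, globals_):
    """nodes: [n, d]; edges: [m, d]; senders, receivers: [m] int indices;
    globals_: [d]. Returns the updated (nodes, edges, globals_)."""
    # 1. Edge update: each edge sees its sender node, receiver node, globals.
    edges = edges + nodes[senders] + nodes[receivers] + globals_
    # 2. Node update: aggregate incoming edges per node, then update.
    incoming = np.zeros_like(nodes)
    np.add.at(incoming, receivers, edges)   # sum edges into their receivers
    nodes = nodes + incoming + globals_
    # 3. Global update: aggregate over all nodes and edges.
    globals_ = globals_ + nodes.mean(axis=0) + edges.mean(axis=0)
    return nodes, edges, globals_
```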

2,170 citations

Journal ArticleDOI
TL;DR: It is found that after training, recurrent units can develop into clusters that are functionally specialized for different cognitive processes, and a simple yet effective measure is introduced to quantify relationships between single-unit neural representations of tasks.
Abstract: The brain has the ability to flexibly perform many tasks, but the underlying mechanism cannot be elucidated in traditional experimental and modeling studies designed for one task at a time. Here, we trained single network models to perform 20 cognitive tasks that depend on working memory, decision making, categorization, and inhibitory control. We found that after training, recurrent units can develop into clusters that are functionally specialized for different cognitive processes, and we introduce a simple yet effective measure to quantify relationships between single-unit neural representations of tasks. Learning often gives rise to compositionality of task representations, a critical feature for cognitive flexibility, whereby one task can be performed by recombining instructions for other tasks. Finally, networks developed mixed task selectivity similar to recorded prefrontal neurons after learning multiple tasks sequentially with a continual-learning technique. This work provides a computational platform to investigate neural representations of many cognitive tasks.
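The "simple yet effective measure" mentioned above is, in spirit, a per-unit task-variance profile. The sketch below is a hedged reconstruction of such a measure, not the paper's exact definition: the function name, the flattening over trials and time, and the per-unit normalization are all illustrative assumptions.

```python
# Hedged reconstruction of a task-variance-style measure; names and the
# normalization scheme are assumptions, not the paper's exact definition.
import numpy as np

def normalized_task_variance(activity):
    """activity: [n_tasks, n_trials, n_time, n_units] recurrent activity.
    Returns [n_tasks, n_units]: each unit's response variance per task,
    normalized so that each unit's values sum to 1 across tasks."""
    n_tasks, n_units = activity.shape[0], activity.shape[-1]
    # Variance of each unit within each task, over trials and time points.
    tv = activity.reshape(n_tasks, -1, n_units).var(axis=1)
    # Per-unit normalization makes profiles comparable across units;
    # clustering these profiles is what reveals functional specialization.
    return tv / (tv.sum(axis=0, keepdims=True) + 1e-12)
```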

361 citations

Journal ArticleDOI
TL;DR: In this article, the bipartite fluctuations of particle number N and spin S^z were investigated in many-body quantum systems, focusing on systems where such U(1) charges are both conserved and fluctuate within subsystems due to exchange of charges between subsystems.
Abstract: We investigate in detail the behavior of the bipartite fluctuations of particle number $\hat{N}$ and spin $\hat{S}^z$ in many-body quantum systems, focusing on systems where such U(1) charges are both conserved and fluctuate within subsystems due to exchange of charges between subsystems. We propose that the bipartite fluctuations are an effective tool for studying many-body physics, particularly its entanglement properties, in the same way that noise and Full Counting Statistics have been used in mesoscopic transport and cold atomic gases. For systems that can be mapped to a problem of non-interacting fermions we show that the fluctuations and higher-order cumulants fully encode the information needed to determine the entanglement entropy as well as the full entanglement spectrum through the Rényi entropies. In this connection we derive a simple formula that explicitly relates the eigenvalues of the reduced density matrix to the Rényi entropies of integer order for any finite density matrix. In other systems, particularly in one dimension, the fluctuations are in many ways similar but not equivalent to the entanglement entropy. Fluctuations are tractable analytically, computable numerically in both density matrix renormalization group and quantum Monte Carlo calculations, and in principle accessible in condensed matter and cold atom experiments. In the context of quantum point contacts, measurement of the second charge cumulant showing a logarithmic dependence on time would constitute a strong indication of many-body entanglement.
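For context, the quantities the abstract connects are related by the standard definitions below; the paper's "simple formula" inverts this relation, recovering the eigenvalues $\lambda_i$ of the reduced density matrix $\rho_A$ from the integer-order Rényi entropies, while the block here shows only the textbook forward direction.

```latex
% Standard definitions connecting the reduced-density-matrix spectrum
% {lambda_i} to the Renyi entropies; the paper's formula goes the other
% way, recovering the lambda_i from the integer-order S_n.
\begin{align}
  S_n &= \frac{1}{1-n} \ln \operatorname{Tr} \rho_A^{\,n}
       = \frac{1}{1-n} \ln \sum_i \lambda_i^{\,n},\\
  S   &= \lim_{n \to 1} S_n = -\sum_i \lambda_i \ln \lambda_i .
\end{align}
```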

246 citations

Journal ArticleDOI
TL;DR: A framework for gradient descent-based training of excitatory-inhibitory RNNs that can incorporate a variety of biological knowledge is described, and an implementation based on the machine learning library Theano, whose automatic differentiation capabilities facilitate modifications and extensions, is provided.
Abstract: The ability to simultaneously record from large numbers of neurons in behaving animals has ushered in a new era for the study of the neural circuit mechanisms underlying cognitive functions. One promising approach to uncovering the dynamical and computational principles governing population responses is to analyze model recurrent neural networks (RNNs) that have been optimized to perform the same tasks as behaving animals. Because the optimization of network parameters specifies the desired output but not the manner in which to achieve this output, "trained" networks serve as a source of mechanistic hypotheses and a testing ground for data analyses that link neural computation to behavior. Complete access to the activity and connectivity of the circuit, and the ability to manipulate them arbitrarily, make trained networks a convenient proxy for biological circuits and a valuable platform for theoretical investigation. However, existing RNNs lack basic biological features such as the distinction between excitatory and inhibitory units (Dale's principle), which are essential if RNNs are to provide insights into the operation of biological circuits. Moreover, trained networks can achieve the same behavioral performance but differ substantially in their structure and dynamics, highlighting the need for a simple and flexible framework for the exploratory training of RNNs. Here, we describe a framework for gradient descent-based training of excitatory-inhibitory RNNs that can incorporate a variety of biological knowledge. We provide an implementation based on the machine learning library Theano, whose automatic differentiation capabilities facilitate modifications and extensions. We validate this framework by applying it to well-known experimental paradigms such as perceptual decision-making, context-dependent integration, multisensory integration, parametric working memory, and motor sequence generation. Our results demonstrate the wide range of neural activity patterns and behavior that can be modeled, and suggest a unified setting in which diverse cognitive computations and mechanisms can be studied.
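One concrete way to read "incorporate a variety of biological knowledge" is the sign constraint of Dale's principle. The sketch below shows one common parameterization, a rectified weight matrix multiplied by a fixed diagonal sign matrix; the 80/20 excitatory-inhibitory split and all names are illustrative assumptions rather than the paper's Theano code.

```python
# Illustrative parameterization enforcing Dale's principle: gradient descent
# acts on an unconstrained matrix w_free, while the effective weights are
# its rectification times a fixed diagonal sign matrix.
import numpy as np

n_units, n_exc = 100, 80                                  # 80% excitatory
sign = np.diag([1.0] * n_exc + [-1.0] * (n_units - n_exc))

w_free = np.random.randn(n_units, n_units) * 0.1          # trained parameters
w_rec = np.maximum(w_free, 0.0) @ sign                    # effective weights

# Column j holds unit j's outgoing weights, so each unit's sign is fixed:
# excitatory units only excite, inhibitory units only inhibit.
assert (w_rec[:, :n_exc] >= 0).all() and (w_rec[:, n_exc:] <= 0).all()
```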

241 citations

Journal ArticleDOI
TL;DR: It is argued that Hanabi elevates reasoning about the beliefs and intentions of other agents to the foreground, and that developing novel techniques for such theory-of-mind reasoning will be crucial not only for success in Hanabi but also in broader collaborative efforts, especially those with human partners.

206 citations


Cited by
Journal ArticleDOI
TL;DR: This article provides a comprehensive overview of graph neural networks (GNNs) in the data mining and machine learning fields and proposes a new taxonomy dividing state-of-the-art GNNs into four categories, namely recurrent GNNs, convolutional GNNs, graph autoencoders, and spatial-temporal GNNs.
Abstract: Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications, where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on the existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this article, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art GNNs into four categories, namely, recurrent GNNs, convolutional GNNs, graph autoencoders, and spatial–temporal GNNs. We further discuss the applications of GNNs across various domains and summarize the open-source codes, benchmark data sets, and model evaluation of GNNs. Finally, we propose potential research directions in this rapidly growing field.
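As a concrete instance of the survey's "convolutional GNNs" category, the sketch below implements the standard GCN propagation rule (symmetrically normalized adjacency with self-loops, a linear map, then ReLU) in plain NumPy; the function name and shapes are illustrative.

```python
# One GCN-style convolutional layer in plain NumPy: add self-loops,
# symmetrically normalize the adjacency matrix, apply a linear map, ReLU.
import numpy as np

def gcn_layer(adj, features, weight):
    """adj: [n, n] adjacency; features: [n, d_in]; weight: [d_in, d_out]."""
    a_hat = adj + np.eye(adj.shape[0])                    # self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt              # D^{-1/2} A D^{-1/2}
    return np.maximum(a_norm @ features @ weight, 0.0)    # ReLU
```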

4,584 citations

Posted Content
TL;DR: A detailed review of existing graph neural network models is provided, the applications are systematically categorized, and four open problems for future research are proposed.
Abstract: Many learning tasks require dealing with graph data, which contains rich relational information among elements. Modeling physics systems, learning molecular fingerprints, predicting protein interfaces, and classifying diseases all demand a model that learns from graph inputs. In other domains, such as learning from non-structural data like texts and images, reasoning on extracted structures (like the dependency trees of sentences and the scene graphs of images) is an important research topic that also needs graph reasoning models. Graph neural networks (GNNs) are neural models that capture the dependence of graphs via message passing between the nodes of graphs. In recent years, variants of GNNs such as the graph convolutional network (GCN), graph attention network (GAT), and graph recurrent network (GRN) have demonstrated ground-breaking performance on many deep learning tasks. In this survey, we propose a general design pipeline for GNN models, discuss the variants of each component, systematically categorize the applications, and propose four open problems for future research.
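Of the variants named above, the graph attention network is perhaps the least obvious from prose alone. The NumPy sketch below shows a single-head GAT attention step under stated assumptions: the adjacency matrix already includes self-loops, the LeakyReLU slope is 0.2, and the names and shapes are illustrative.

```python
# Single-head GAT attention step, assuming adj already contains self-loops
# and using the conventional LeakyReLU slope of 0.2.
import numpy as np

def gat_attention(adj, h, w, a):
    """adj: [n, n]; h: [n, d_in]; w: [d_in, d_out]; a: [2 * d_out]."""
    z = h @ w                                             # shared transform
    d = z.shape[1]
    # Logits e_ij = LeakyReLU(a . [z_i || z_j]), split into two dot products.
    e = (z @ a[:d])[:, None] + (z @ a[d:])[None, :]       # [n, n]
    e = np.where(e > 0, e, 0.2 * e)                       # LeakyReLU
    e = np.where(adj > 0, e, -1e9)                        # mask non-edges
    # Row-wise softmax: each node attends over its neighborhood.
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ z                                      # attended features
```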

2,494 citations

Posted Content
TL;DR: PyTorch Geometric, a library built upon PyTorch for deep learning on irregularly structured input data such as graphs, point clouds, and manifolds, is introduced, and a comprehensive comparative study of the implemented methods in homogeneous evaluation scenarios is performed.
Abstract: We introduce PyTorch Geometric, a library for deep learning on irregularly structured input data such as graphs, point clouds and manifolds, built upon PyTorch. In addition to general graph data structures and processing methods, it contains a variety of recently published methods from the domains of relational learning and 3D data processing. PyTorch Geometric achieves high data throughput by leveraging sparse GPU acceleration, by providing dedicated CUDA kernels and by introducing efficient mini-batch handling for input examples of different size. In this work, we present the library in detail and perform a comprehensive comparative study of the implemented methods in homogeneous evaluation scenarios.
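A minimal usage sketch of the library follows, pushing a toy graph through two GCNConv layers; the graph and layer sizes are invented for illustration, while Data, GCNConv, and their call signatures are part of PyTorch Geometric's documented API.

```python
# Toy PyTorch Geometric usage: a 3-node graph through a two-layer GCN.
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

edge_index = torch.tensor([[0, 1, 1, 2],                     # source nodes
                           [1, 0, 2, 1]], dtype=torch.long)  # target nodes
x = torch.randn(3, 8)                                    # 8-dim node features
data = Data(x=x, edge_index=edge_index)

conv1, conv2 = GCNConv(8, 16), GCNConv(16, 2)
h = F.relu(conv1(data.x, data.edge_index))
out = conv2(h, data.edge_index)                          # [3, 2] node logits
```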

2,308 citations

Journal ArticleDOI
TL;DR: In this paper, the role of perturbative renormalization group (RG) approaches and self-consistent renormalized spin-fluctuation (SCR-SF) theories in understanding the quantum-classical crossover in the vicinity of the quantum critical point is discussed, with generalization to the Kondo effect in heavy-fermion systems.
Abstract: We give a general introduction to quantum phase transitions in strongly-correlated electron systems. These transitions, which occur at zero temperature when a non-thermal parameter $g$ like pressure, chemical composition or magnetic field is tuned to a critical value, are characterized by a dynamic exponent $z$ related to the energy and length scales $\Delta$ and $\xi$. Simple arguments based on an expansion to first order in the effective interaction allow one to define an upper critical dimension $D_{C}=4$ (where $D=d+z$ and $d$ is the spatial dimension) below which the mean-field description is no longer valid. We emphasize the role of perturbative renormalization group (RG) approaches and self-consistent renormalized spin-fluctuation (SCR-SF) theories in understanding the quantum-classical crossover in the vicinity of the quantum critical point, with generalization to the Kondo effect in heavy-fermion systems. Finally, we quote some recent inelastic neutron scattering experiments performed on heavy-fermion compounds which lead to an unusual scaling law in $\omega /T$ for the dynamical spin susceptibility, revealing critical local modes beyond the itinerant magnetism scheme, and mention new attempts to describe this local quantum critical point.
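Collected for reference, the scaling relations invoked in the abstract read as follows; the correlation-length form $\xi \sim |g-g_c|^{-\nu}$ is standard background added here for context rather than a statement from the abstract itself.

```latex
% Scaling relations invoked in the abstract; the correlation-length form
% xi ~ |g - g_c|^{-nu} is standard background, added for context.
\begin{align}
  \xi &\sim |g - g_c|^{-\nu}, &
  \Delta &\sim \xi^{-z} \sim |g - g_c|^{\nu z},\\
  D &= d + z, &
  D_C &= 4 \quad (\text{mean-field theory valid for } D \ge D_C).
\end{align}
```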

1,347 citations

Journal ArticleDOI
01 Jan 2020
TL;DR: In this paper, the authors propose a general design pipeline for GNN models and discuss the variants of each component, systematically categorize the applications, and propose four open problems for future research.
Abstract: Many learning tasks require dealing with graph data, which contains rich relational information among elements. Modeling physics systems, learning molecular fingerprints, predicting protein interfaces, and classifying diseases all demand a model that learns from graph inputs. In other domains, such as learning from non-structural data like texts and images, reasoning on extracted structures (like the dependency trees of sentences and the scene graphs of images) is an important research topic that also needs graph reasoning models. Graph neural networks (GNNs) are neural models that capture the dependence of graphs via message passing between the nodes of graphs. In recent years, variants of GNNs such as the graph convolutional network (GCN), graph attention network (GAT), and graph recurrent network (GRN) have demonstrated ground-breaking performance on many deep learning tasks. In this survey, we propose a general design pipeline for GNN models, discuss the variants of each component, systematically categorize the applications, and propose four open problems for future research.

1,266 citations