scispace - formally typeset
Search or ask a question
Posted Content

Rotation Invariant Graph Neural Networks using Spin Convolutions

TL;DR: In this paper, the angular information between sets of neighboring atoms in a graph neural network is modeled by using a per-edge local coordinate frame and a spin convolution over the remaining degree of freedom.
Abstract: Progress towards the energy breakthroughs needed to combat climate change can be significantly accelerated through the efficient simulation of atomic systems. Simulation techniques based on first principles, such as Density Functional Theory (DFT), are limited in their practical use due to their high computational expense. Machine learning approaches have the potential to approximate DFT in a computationally efficient manner, which could dramatically increase the impact of computational simulations on real-world problems. Approximating DFT poses several challenges. These include accurately modeling the subtle changes in the relative positions and angles between atoms, and enforcing constraints such as rotation invariance or energy conservation. We introduce a novel approach to modeling angular information between sets of neighboring atoms in a graph neural network. Rotation invariance is achieved for the network's edge messages through the use of a per-edge local coordinate frame and a novel spin convolution over the remaining degree of freedom. Two model variants are proposed for the applications of structure relaxation and molecular dynamics. State-of-the-art results are demonstrated on the large-scale Open Catalyst 2020 dataset. Comparisons are also performed on the MD17 and QM9 datasets.
Citations
More filters
Posted Content
TL;DR: GemNet as discussed by the authors uses two-hop message passing to predict molecular interactions and achieves state-of-the-art performance on the COLL and MD17 molecular dynamics datasets.
Abstract: Effectively predicting molecular interactions has the potential to accelerate molecular dynamics by multiple orders of magnitude and thus revolutionize chemical simulations. Graph neural networks (GNNs) have recently shown great successes for this task, overtaking classical methods based on fixed molecular kernels. However, they still appear very limited from a theoretical perspective, since regular GNNs cannot distinguish certain types of graphs. In this work we close this gap between theory and practice. We show that GNNs with directed edge embeddings and two-hop message passing are indeed universal approximators for predictions that are invariant to global rotation and translation, and equivariant to permutation. We then leverage these insights and multiple structural improvements to propose the geometric message passing neural network (GemNet). We demonstrate the benefits of the proposed changes in multiple ablation studies. GemNet outperforms previous models on the COLL and MD17 molecular dynamics datasets by 34% and 40%, performing especially well on the most challenging molecules.

1 citations

Posted Content
TL;DR: In this paper, the authors propose a crystal diffusion variational autoencoder (CDVAE) that captures the physical inductive bias of material stability by learning from the data distribution of stable materials.
Abstract: Generating the periodic structure of stable materials is a long-standing challenge for the material design community. This task is difficult because stable materials only exist in a low-dimensional subspace of all possible periodic arrangements of atoms: 1) the coordinates must lie in the local energy minimum defined by quantum mechanics, and 2) global stability also requires the structure to follow the complex, yet specific bonding preferences between different atom types. Existing methods fail to incorporate these factors and often lack proper invariances. We propose a Crystal Diffusion Variational Autoencoder (CDVAE) that captures the physical inductive bias of material stability. By learning from the data distribution of stable materials, the decoder generates materials in a diffusion process that moves atomic coordinates towards a lower energy state and updates atom types to satisfy bonding preferences between neighbors. Our model also explicitly encodes interactions across periodic boundaries and respects permutation, translation, rotation, and periodic invariances. We significantly outperform past methods in three tasks: 1) reconstructing the input structure, 2) generating valid, diverse, and realistic materials, and 3) generating materials that optimize a specific property. We also provide several standard datasets and evaluation metrics for the broader machine learning community.

1 citations

Posted Content
TL;DR: 3D-Transformer as mentioned in this paper proposes a multi-scale self-attention module that exploits local fine-grained patterns with increasing contextual scales to deal with the non-uniformity of interatomic distances.
Abstract: Spatial structures in the 3D space are important to determine molecular properties. Recent papers use geometric deep learning to represent molecules and predict properties. These papers, however, are computationally expensive in capturing long-range dependencies of input atoms; and have not considered the non-uniformity of interatomic distances, thus failing to learn context-dependent representations at different scales. To deal with such issues, we introduce 3D-Transformer, a variant of the Transformer for molecular representations that incorporates 3D spatial information. 3D-Transformer operates on a fully-connected graph with direct connections between atoms. To cope with the non-uniformity of interatomic distances, we develop a multi-scale self-attention module that exploits local fine-grained patterns with increasing contextual scales. As molecules of different sizes rely on different kinds of spatial features, we design an adaptive position encoding module that adopts different position encoding methods for small and large molecules. Finally, to attain the molecular representation from atom embeddings, we propose an attentive farthest point sampling algorithm that selects a portion of atoms with the assistance of attention scores, overcoming handicaps of the virtual node and previous distance-dominant downsampling methods. We validate 3D-Transformer across three important scientific domains: quantum chemistry, material science, and proteomics. Our experiments show significant improvements over state-of-the-art models on the crystal property prediction task and the protein-ligand binding affinity prediction task, and show better or competitive performance in quantum chemistry molecular datasets. This work provides clear evidence that biochemical tasks can gain consistent benefits from 3D molecular representations and different tasks require different position encoding methods.
Posted Content
TL;DR: Frame Averaging (FA) as discussed by the authors is a general purpose and systematic framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types.
Abstract: Many machine learning tasks involve learning functions that are known to be invariant or equivariant to certain symmetries of the input data. However, it is often challenging to design neural network architectures that respect these symmetries while being expressive and computationally efficient. For example, Euclidean motion invariant/equivariant graph or point cloud neural networks. We introduce Frame Averaging (FA), a general purpose and systematic framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types. Our framework builds on the well known group averaging operator that guarantees invariance or equivariance but is intractable. In contrast, we observe that for many important classes of symmetries, this operator can be replaced with an averaging operator over a small subset of the group elements, called a frame. We show that averaging over a frame guarantees exact invariance or equivariance while often being much simpler to compute than averaging over the entire group. Furthermore, we prove that FA-based models have maximal expressive power in a broad setting and in general preserve the expressive power of their backbone architectures. Using frame averaging, we propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs. We demonstrate the practical effectiveness of FA on several applications including point cloud normal estimation, beyond $2$-WL graph separation, and $n$-body dynamics prediction, achieving state-of-the-art results in all of these benchmarks.
References
More filters
Proceedings Article
04 Apr 2017
TL;DR: The Message Passing Neural Networks (MPNNs) as mentioned in this paper are a generalization of the message passing algorithm and aggregation procedure to compute a function of their entire input graph, and have been shown to achieve state-of-the-art results on an important molecular property prediction benchmark.
Abstract: Supervised learning on molecules has incredible potential to be useful in chemistry, drug discovery, and materials science. Luckily, several promising and closely related neural network models invariant to molecular symmetries have already been described in the literature. These models learn a message passing algorithm and aggregation procedure to compute a function of their entire input graph. At this point, the next step is to find a particularly effective variant of this general approach and apply it to chemical prediction benchmarks until we either solve them or reach the limits of the approach. In this paper, we reformulate existing models into a single common framework we call Message Passing Neural Networks (MPNNs) and explore additional novel variations within this framework. Using MPNNs we demonstrate state of the art results on an important molecular property prediction benchmark; these results are strong enough that we believe future work should focus on datasets with larger molecules or more accurate ground truth labels.

3,219 citations

Proceedings Article
01 Apr 2016
TL;DR: This work studies feature learning techniques for graph-structured inputs and achieves state-of-the-art performance on a problem from program verification, in which subgraphs need to be matched to abstract data structures.
Abstract: Graph-structured data appears frequently in domains including chemistry, natural language semantics, social networks, and knowledge bases. In this work, we study feature learning techniques for graph-structured inputs. Our starting point is previous work on Graph Neural Networks (Scarselli et al., 2009), which we modify to use gated recurrent units and modern optimization techniques and then extend to output sequences. The result is a flexible and broadly useful class of neural network models that has favorable inductive biases relative to purely sequence-based models (e.g., LSTMs) when the problem is graph-structured. We demonstrate the capabilities on some simple AI (bAbI) and graph algorithm learning tasks. We then show it achieves state-of-the-art performance on a problem from program verification, in which subgraphs need to be matched to abstract data structures.

2,518 citations

Journal ArticleDOI
15 Jan 2020-Nature
TL;DR: It is shown that a neural network can be trained to make accurate predictions of the distances between pairs of residues, which convey more information about the structure than contact predictions, and the resulting potential can be optimized by a simple gradient descent algorithm to generate structures without complex sampling procedures.
Abstract: Protein structure prediction can be used to determine the three-dimensional shape of a protein from its amino acid sequence1. This problem is of fundamental importance as the structure of a protein largely determines its function2; however, protein structures can be difficult to determine experimentally. Considerable progress has recently been made by leveraging genetic information. It is possible to infer which amino acid residues are in contact by analysing covariation in homologous sequences, which aids in the prediction of protein structures3. Here we show that we can train a neural network to make accurate predictions of the distances between pairs of residues, which convey more information about the structure than contact predictions. Using this information, we construct a potential of mean force4 that can accurately describe the shape of a protein. We find that the resulting potential can be optimized by a simple gradient descent algorithm to generate structures without complex sampling procedures. The resulting system, named AlphaFold, achieves high accuracy, even for sequences with fewer homologous sequences. In the recent Critical Assessment of Protein Structure Prediction5 (CASP13)-a blind assessment of the state of the field-AlphaFold created high-accuracy structures (with template modelling (TM) scores6 of 0.7 or higher) for 24 out of 43 free modelling domains, whereas the next best method, which used sampling and contact information, achieved such accuracy for only 14 out of 43 domains. AlphaFold represents a considerable advance in protein-structure prediction. We expect this increased accuracy to enable insights into the function and malfunction of proteins, especially in cases for which no structures for homologous proteins have been experimentally determined7.

2,097 citations

Posted Content
TL;DR: Group Normalization can outperform its BN-based counterparts for object detection and segmentation in COCO, and for video classification in Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks.
Abstract: Batch Normalization (BN) is a milestone technique in the development of deep learning, enabling various networks to train. However, normalizing along the batch dimension introduces problems --- BN's error increases rapidly when the batch size becomes smaller, caused by inaccurate batch statistics estimation. This limits BN's usage for training larger models and transferring features to computer vision tasks including detection, segmentation, and video, which require small batches constrained by memory consumption. In this paper, we present Group Normalization (GN) as a simple alternative to BN. GN divides the channels into groups and computes within each group the mean and variance for normalization. GN's computation is independent of batch sizes, and its accuracy is stable in a wide range of batch sizes. On ResNet-50 trained in ImageNet, GN has 10.6% lower error than its BN counterpart when using a batch size of 2; when using typical batch sizes, GN is comparably good with BN and outperforms other normalization variants. Moreover, GN can be naturally transferred from pre-training to fine-tuning. GN can outperform its BN-based counterparts for object detection and segmentation in COCO, and for video classification in Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks. GN can be easily implemented by a few lines of code in modern libraries.

1,924 citations

Proceedings ArticleDOI
27 Dec 2005
TL;DR: A new neural model, called graph neural network (GNN), capable of directly processing graphs, which extends recursive neural networks and can be applied on most of the practically useful kinds of graphs, including directed, undirected, labelled and cyclic graphs.
Abstract: In several applications the information is naturally represented by graphs. Traditional approaches cope with graphical data structures using a preprocessing phase which transforms the graphs into a set of flat vectors. However, in this way, important topological information may be lost and the achieved results may heavily depend on the preprocessing stage. This paper presents a new neural model, called graph neural network (GNN), capable of directly processing graphs. GNNs extends recursive neural networks and can be applied on most of the practically useful kinds of graphs, including directed, undirected, labelled and cyclic graphs. A learning algorithm for GNNs is proposed and some experiments are discussed which assess the properties of the model.

1,569 citations