Open Access Proceedings Article

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

TLDR
Cluster-GCN, as discussed by the authors, is a novel GCN algorithm suitable for SGD-based training that exploits the graph clustering structure: at each step, it samples a block of nodes associated with a dense subgraph and restricts the neighborhood search to within that subgraph.
Abstract
Graph convolutional networks (GCNs) have been successfully applied to many graph-based applications; however, training a large-scale GCN remains challenging. Current SGD-based algorithms suffer either a high computational cost that grows exponentially with the number of GCN layers, or a large space requirement for keeping the entire graph and the embedding of every node in memory. In this paper, we propose Cluster-GCN, a novel GCN algorithm that is suitable for SGD-based training by exploiting the graph clustering structure. Cluster-GCN works as follows: at each step, it samples a block of nodes associated with a dense subgraph identified by a graph clustering algorithm, and restricts the neighborhood search to within this subgraph. This simple but effective strategy significantly improves memory and computational efficiency while achieving test accuracy comparable to previous algorithms. To test the scalability of our algorithm, we create a new Amazon2M dataset with 2 million nodes and 61 million edges, more than 5 times larger than the previous largest publicly available dataset (Reddit). For training a 3-layer GCN on this data, Cluster-GCN is faster than the previous state-of-the-art VR-GCN (1523 seconds vs. 1961 seconds) while using much less memory (2.2GB vs. 11.2GB). For training a 4-layer GCN on this data, our algorithm finishes in around 36 minutes, whereas all existing GCN training algorithms fail due to out-of-memory errors. Furthermore, Cluster-GCN allows us to train much deeper GCNs without significant time or memory overhead, which leads to improved prediction accuracy: using a 5-layer Cluster-GCN, we achieve a state-of-the-art test F1 score of 99.36 on the PPI dataset, while the previous best result was 98.71 by [16]. Our code is publicly available at this https URL.
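To make the training procedure concrete, here is a minimal sketch in PyTorch. It is an illustration under simplifying assumptions, not the authors' implementation: the random node partition stands in for a real graph clustering algorithm such as METIS, and the toy adjacency, features, and labels are placeholders.

```python
# Minimal Cluster-GCN-style training sketch (not the authors' code).
# The random partition below stands in for a real graph clustering
# algorithm such as METIS; X, A, and y are toy dense placeholders.
import torch
import torch.nn.functional as F

def normalize_adj(A):
    """Symmetrically normalize an adjacency matrix with self-loops."""
    A_hat = A + torch.eye(A.size(0))
    d = A_hat.sum(dim=1)
    D_inv_sqrt = torch.diag(d.pow(-0.5))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

class GCN(torch.nn.Module):
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.W1 = torch.nn.Linear(in_dim, hid_dim)
        self.W2 = torch.nn.Linear(hid_dim, out_dim)

    def forward(self, A_norm, X):
        H = F.relu(A_norm @ self.W1(X))  # layer 1: aggregate within the block
        return A_norm @ self.W2(H)       # layer 2

n, d, c, num_clusters = 1000, 32, 7, 10
X = torch.randn(n, d)                    # toy node features
A = (torch.rand(n, n) < 0.01).float()    # toy adjacency
A = ((A + A.t()) > 0).float()            # symmetrize
y = torch.randint(0, c, (n,))

# Stand-in for graph clustering: random, equally sized blocks.
clusters = torch.randperm(n).chunk(num_clusters)

model = GCN(d, 64, c)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for epoch in range(5):
    for idx in clusters:
        # Restrict both the sampled nodes and the neighborhood
        # search to a single block's subgraph.
        A_sub = normalize_adj(A[idx][:, idx])
        logits = model(A_sub, X[idx])
        loss = F.cross_entropy(logits, y[idx])
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Because each step touches only one block's nodes and edges, the per-step memory footprint is bounded by the block size rather than by the full graph, which is the source of the savings the abstract reports.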


Citations
Journal Article

Benchmarking Graph Neural Networks

TL;DR: A reproducible GNN benchmarking framework is introduced, making it convenient for researchers to add new models and datasets, together with a principled investigation of recent Weisfeiler-Lehman GNNs (WL-GNNs) compared to message-passing graph convolutional networks (GCNs).
Journal Article

A Metaverse: Taxonomy, Components, Applications, and Open Challenges

01 Jan 2022
TL;DR: In this article, the authors divide the concepts and essential techniques necessary for realizing the Metaverse into three components (i.e., hardware, software, and content), rather than taking a marketing- or hardware-centric approach, to conduct a comprehensive analysis.
Proceedings Article

Decoupling the Depth and Scope of Graph Neural Networks

TL;DR: This work proposes a design principle that decouples the depth and scope of GNNs: to generate the representation of a target entity, first extract a localized subgraph as the bounded-size scope, and then apply a GNN of arbitrary depth on top of that subgraph.
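A hedged sketch of the decoupling idea, assuming a dense adjacency matrix: the scope is a fixed k-hop subgraph around the target node, and whatever GNN is applied on top of it (for example, the Cluster-GCN-style model sketched above) can have arbitrary depth. The extraction helper below is illustrative, not the paper's code.

```python
# Sketch of decoupled depth and scope (illustrative, not the paper's code):
# the scope is a bounded k-hop subgraph, while the GNN applied on top of it
# may have any number of layers.
import torch

def k_hop_subgraph(A, target, k=2):
    """Return indices of nodes within k hops of `target` (dense adjacency)."""
    frontier = torch.zeros(A.size(0), dtype=torch.bool)
    frontier[target] = True
    for _ in range(k):
        frontier = frontier | (A[frontier].sum(dim=0) > 0)
    return frontier.nonzero(as_tuple=True)[0]

n = 500
A = (torch.rand(n, n) < 0.02).float()
A = ((A + A.t()) > 0).float()

idx = k_hop_subgraph(A, target=0, k=2)  # bounded-size scope
A_sub = A[idx][:, idx]                  # an arbitrarily deep GNN runs on this
```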
Journal Article

Data Augmentation for Deep Graph Learning: A Survey

TL;DR: A taxonomy of graph data augmentation techniques is proposed, together with a structured review that categorizes related work by the augmented information modality, and promising research directions and open challenges are identified.
References
Proceedings Article

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors propose a residual learning framework that eases the training of networks substantially deeper than those used previously, which won first place in the ILSVRC 2015 classification task.
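The core of the residual learning idea is easy to state in code: a block's layers learn a residual function F(x), and the block outputs F(x) + x through an identity shortcut. The following is a generic sketch, not the authors' exact architecture.

```python
import torch
import torch.nn.functional as F

class ResidualBlock(torch.nn.Module):
    """Basic residual block: the layers learn F(x); the output is F(x) + x."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = torch.nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = torch.nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        out = self.conv2(F.relu(self.conv1(x)))
        return F.relu(out + x)  # identity shortcut eases optimization

x = torch.randn(1, 64, 32, 32)
y = ResidualBlock(64)(x)
```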

Automatic differentiation in PyTorch

TL;DR: The automatic differentiation module of PyTorch is described: a library designed to enable rapid research on machine learning models by differentiating purely imperative programs, with an emphasis on extensibility and low overhead.
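As a concrete example of differentiating a "purely imperative program", the snippet below applies torch.autograd to ordinary Python control flow; it is a standard usage illustration, not code from the cited paper.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)

# Ordinary imperative code: loops and branches are differentiated
# exactly as they execute on this particular input.
y = x
for _ in range(3):
    y = y * x if y < 100 else y + x

y.backward()
print(x.grad)  # here y = x**4 along the taken path, so the gradient is 4*x**3
```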
Posted Content

Inductive Representation Learning on Large Graphs

TL;DR: GraphSAGE is presented, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data and outperforms strong baselines on three inductive node-classification benchmarks.
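The central mechanism, aggregating sampled neighbor features rather than learning a per-node embedding table, can be sketched as a single mean-aggregator layer. This is a simplification for illustration, not the reference implementation; the sampled neighbor indices here are random stand-ins.

```python
import torch
import torch.nn.functional as F

class SAGEMeanLayer(torch.nn.Module):
    """Simplified GraphSAGE-style mean aggregator (illustrative)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = torch.nn.Linear(2 * in_dim, out_dim)

    def forward(self, X, neighbor_idx):
        # neighbor_idx: (n, k) indices of k sampled neighbors per node.
        neigh = X[neighbor_idx].mean(dim=1)       # aggregate sampled neighbors
        h = self.W(torch.cat([X, neigh], dim=1))  # combine self and neighborhood
        return F.normalize(F.relu(h), dim=1)

n, d, k = 100, 16, 5
X = torch.randn(n, d)
neighbor_idx = torch.randint(0, n, (n, k))  # stand-in for neighbor sampling
h = SAGEMeanLayer(d, 32)(X, neighbor_idx)   # applies to unseen nodes as well
```

Because the layer is a function of node features rather than node identity, it can embed previously unseen nodes, which is what makes the framework inductive.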
Proceedings Article

Graph Attention Networks

TL;DR: Graph Attention Networks (GATs) as mentioned in this paper leverage masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations.
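A minimal single-head attention layer in the spirit of GAT: pairwise attention logits are computed and then masked to the graph's edges (plus self-loops). This is a sketch, not the authors' implementation, and omits multi-head attention and dropout.

```python
import torch
import torch.nn.functional as F

class GATLayer(torch.nn.Module):
    """Single-head masked graph attention layer (simplified sketch)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = torch.nn.Linear(in_dim, out_dim, bias=False)
        self.a = torch.nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, X, A):
        H = self.W(X)  # (n, out_dim)
        n = H.size(0)
        # Pairwise attention logits e_ij = a([h_i || h_j]).
        Hi = H.unsqueeze(1).expand(n, n, -1)
        Hj = H.unsqueeze(0).expand(n, n, -1)
        e = F.leaky_relu(self.a(torch.cat([Hi, Hj], dim=-1)).squeeze(-1))
        # Mask: attend only over existing edges (plus self-loops).
        mask = (A + torch.eye(n)) > 0
        e = e.masked_fill(~mask, float('-inf'))
        alpha = torch.softmax(e, dim=1)  # normalized attention coefficients
        return F.elu(alpha @ H)

n, d = 50, 8
X = torch.randn(n, d)
A = (torch.rand(n, n) < 0.1).float()
out = GATLayer(d, 16)(X, A)
```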