Open Graph Benchmark: Datasets for Machine Learning on Graphs

Open AccessPosted Content

Open Graph Benchmark: Datasets for Machine Learning on Graphs

Weihua Hu, +7 more

- 02 May 2020 -

arXiv: Learning

Chats0

TLDR

The OGB datasets are large-scale, encompass multiple important graph ML tasks, and cover a diverse range of domains, ranging from social and information networks to biological networks, molecular graphs, source code ASTs, and knowledge graphs, indicating fruitful opportunities for future research.

Abstract:

We present the Open Graph Benchmark (OGB), a diverse set of challenging and realistic benchmark datasets to facilitate scalable, robust, and reproducible graph machine learning (ML) research. OGB datasets are large-scale, encompass multiple important graph ML tasks, and cover a diverse range of domains, ranging from social and information networks to biological networks, molecular graphs, source code ASTs, and knowledge graphs. For each dataset, we provide a unified evaluation protocol using meaningful application-specific data splits and evaluation metrics. In addition to building the datasets, we also perform extensive benchmark experiments for each dataset. Our experiments suggest that OGB datasets present significant challenges of scalability to large-scale graphs and out-of-distribution generalization under realistic data splits, indicating fruitful opportunities for future research. Finally, OGB provides an automated end-to-end graph ML pipeline that simplifies and standardizes the process of graph data loading, experimental setup, and model evaluation. OGB will be regularly updated and welcomes inputs from the community. OGB datasets as well as data loaders, evaluation scripts, baseline code, and leaderboards are publicly available at this https URL .

Citations

PDF

Open Access

More filters

Posted Content

Graph Neural Networks: A Review of Methods and Applications

Jie Zhou, +8 more

- 20 Dec 2018 -

arXiv: Learning

TL;DR: A detailed review over existing graph neural network models is provided, systematically categorize the applications, and four open problems for future research are proposed.

...read moreread less

Posted Content

Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks

Minjie Wang, +14 more

- 03 Sep 2019 -

arXiv: Learning

TL;DR: DGL distills the computational patterns of GNNs into a few generalized sparse tensor operations suitable for extensive parallelization and allows users to easily port and leverage the existing components across multiple deep learning frameworks.

...read moreread less

Posted Content

WILDS: A Benchmark of in-the-Wild Distribution Shifts

Pang Wei Koh, +22 more

- 14 Dec 2020 -

arXiv: Learning

TL;DR: WILDS is presented, a benchmark of in-the-wild distribution shifts spanning diverse data modalities and applications, and is hoped to encourage the development of general-purpose methods that are anchored to real-world distribution shifts and that work well across different applications and problem settings.

...read moreread less

Posted Content

Benchmarking Graph Neural Networks

Vijay Prakash Dwivedi, +4 more

- 02 Mar 2020 -

arXiv: Learning

TL;DR: A reproducible GNN benchmarking framework is introduced, with the facility for researchers to add new models conveniently for arbitrary datasets, and a principled investigation into the recent Weisfeiler-Lehman GNNs (WL-GNNs) compared to message passing-based graph convolutional networks (GCNs).

...read moreread less

Journal Article

Benchmarking Graph Neural Networks

Vijay Prakash Dwivedi, +4 more

Journal of Machine Learning Research

TL;DR: A reproducible GNN benchmarking framework is introduced, with the facility for researchers to add new models conveniently for arbitrary datasets, and a principled investigation into the recent Weisfeiler-Lehman GNNs (WL-GNNs) compared to message passing-based graph convolutional networks (GCNs).

...read moreread less

Collapse

References

PDF

Open Access

More filters

Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.

...read moreread less

Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

Jia Deng, +5 more

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

...read moreread less

Book ChapterDOI

Microsoft COCO: Common Objects in Context

Tsung-Yi Lin, +7 more

TL;DR: A new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding by gathering images of complex everyday scenes containing common objects in their natural context.

...read moreread less

Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

Tomas Mikolov, +4 more

TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.

...read moreread less

Posted Content

Semi-Supervised Classification with Graph Convolutional Networks

Thomas Kipf, +1 more

- 09 Sep 2016 -

arXiv: Learning

TL;DR: A scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs which outperforms related methods by a significant margin.

...read moreread less