DisGCo: A Compiler for Distributed Graph Analytics

doi:10.1145/3414469

Anchu Rajendran, +1 more

30 Sep 2020

ACM Transactions on Architecture and Cod...

Vol. 17, Iss. 4, pp. 1-26

TLDR

DisGCo is the first graph DSL compiler that can handle all syntactic capabilities of a practical graph DSL like Green-Marl and generate code that can run on distributed systems.

Abstract:

Graph algorithms are widely used across applications, and their programmability and performance have attracted considerable research interest. Being able to run graph analytics programs on distributed systems is an important requirement. Green-Marl is a popular domain-specific language (DSL) for coding graph algorithms and is known for its simplicity. However, the existing Green-Marl compiler for distributed systems (Green-Marl to Pregel) can compile only a limited class of Green-Marl programs (those in Pregel canonical form). This severely restricts the kinds of parallel Green-Marl programs that can be executed on distributed systems. We present DisGCo, the first compiler to translate any general Green-Marl program to an equivalent MPI program that can run on distributed systems. Beyond differences in syntax, translating Green-Marl programs to MPI (SPMD/MPMD style of computation, distributed memory) presents many challenges, because Green-Marl gives the programmer a unified view of the whole memory and allows parallel and serial code to be intermixed. We first present the set of challenges involved in translating Green-Marl programs to MPI and then present a systematic approach to the translation. We also present a few optimization techniques that improve the performance of the generated programs. DisGCo is the first graph DSL compiler that can handle all syntactic capabilities of a practical graph DSL like Green-Marl and generate code that can run on distributed systems. Our preliminary evaluation shows that the generated programs are scalable. Further, compared to the state-of-the-art DH-Falcon compiler, which translates a subset of Falcon programs to MPI, our generated code exhibits a geomean speedup of 17.32×.
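To make the gap between a unified-memory view and SPMD execution concrete, here is a minimal Python sketch. It is not DisGCo's generated code and uses no MPI library: the ranks are simulated inside one process, and every name in it (`owner`, `local_in_degree_contrib`, `all_reduce_sum`, the block partitioning scheme) is a hypothetical choice made for illustration. It computes vertex in-degrees twice: once as a single loop over shared memory, and once with each simulated rank scanning only the edges whose source vertex it owns, followed by a sum reduction of the kind `MPI_Allreduce` with `MPI_SUM` performs.

```python
# Illustrative sketch only: simulates SPMD ranks in a single process.
# All function names and the partitioning scheme are assumptions,
# not taken from DisGCo or Green-Marl.
from typing import List, Tuple

Edge = Tuple[int, int]  # (source vertex, destination vertex)


def in_degrees_shared(edges: List[Edge], num_vertices: int) -> List[int]:
    """Unified-memory view: one loop updates a single global array."""
    deg = [0] * num_vertices
    for _, dst in edges:
        deg[dst] += 1
    return deg


def owner(vertex: int, num_vertices: int, num_ranks: int) -> int:
    """Block partitioning: map a vertex to the rank that owns it."""
    block = (num_vertices + num_ranks - 1) // num_ranks  # ceiling division
    return vertex // block


def local_in_degree_contrib(
    rank: int, edges: List[Edge], num_vertices: int, num_ranks: int
) -> List[int]:
    """Work of one rank: count only edges whose source vertex it owns."""
    contrib = [0] * num_vertices
    for src, dst in edges:
        if owner(src, num_vertices, num_ranks) == rank:
            contrib[dst] += 1
    return contrib


def all_reduce_sum(per_rank: List[List[int]]) -> List[int]:
    """Stand-in for an element-wise sum all-reduce across ranks."""
    return [sum(values) for values in zip(*per_rank)]


def in_degrees_spmd(
    edges: List[Edge], num_vertices: int, num_ranks: int
) -> List[int]:
    """Run every simulated rank, then combine the partial results."""
    per_rank = [
        local_in_degree_contrib(r, edges, num_vertices, num_ranks)
        for r in range(num_ranks)
    ]
    return all_reduce_sum(per_rank)
```

Both versions return the same result for any number of ranks. The distributed version differs in needing an explicit ownership rule and an explicit communication step, which is the kind of bookkeeping a compiler targeting MPI has to generate on the programmer's behalf.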
