Proceedings Article•DOI•

GraphIVE: Heterogeneity-Aware Adaptive Graph Partitioning in GraphLab

TL;DR: This work determines the extent to which the current scheduler in GraphLab can handle heterogeneity, and proposes GraphIVE (Graph Processing In Varied Environments), a capability-aware graph partitioning policy for GraphLab applications that significantly improves the execution time of jobs.
Abstract: GraphLab, a distributed graph-processing framework, has found multiple applications in data mining. Its scalability makes it the perfect choice for running graph algorithms on large data. The current scheduler in GraphLab splits the graph based on various partitioning strategies. These strategies split the graph into approximately equal parts, which is suited for homogeneous clusters, but is liable to perform poorly in the presence of heterogeneity. A number of challenges arise when the nodes differ in memory and processing power. We show that memory in particular can be a severe bottleneck, even leading to the termination of certain jobs. We determine the extent to which the current scheduler can handle heterogeneity. We further propose GraphIVE (Graph Processing In Varied Environments), a capability-aware graph partitioning policy for GraphLab applications. Moreover, GraphIVE continuously tries to reach optimum performance via hill climbing. We describe how GraphIVE reduces the communication overhead by reducing the replication factor of vertices. We implemented a prototype of GraphIVE and present preliminary results. GraphIVE significantly improves the execution time of jobs. The results also show how it allows for seamless graph processing on a heterogeneous cluster.
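The mechanism sketched in the abstract, capability-proportional partitioning refined by hill climbing, can be illustrated in a few lines. The Python sketch below is editorial: the capability scores, step size, and rebalancing rule are assumptions for illustration, not GraphIVE's actual implementation.

```python
# Minimal sketch: size each node's partition in proportion to its
# capability, then hill-climb on observed runtimes after each job.
# Scores, step size, and the move rule are illustrative assumptions.

def partition_shares(capabilities):
    """Assign each node a share of the graph proportional to its capability."""
    total = sum(capabilities.values())
    return {node: cap / total for node, cap in capabilities.items()}

def hill_climb(shares, runtimes, step=0.02):
    """Shift a small share of work from the slowest node to the fastest one."""
    slowest = max(runtimes, key=runtimes.get)
    fastest = min(runtimes, key=runtimes.get)
    delta = min(step, shares[slowest])
    shares[slowest] -= delta
    shares[fastest] += delta
    return shares

# Example: node B has twice the capability of A, so it starts with a
# larger share; observed runtimes then refine the split.
shares = partition_shares({"A": 1.0, "B": 2.0, "C": 1.5})
shares = hill_climb(shares, {"A": 120.0, "B": 80.0, "C": 95.0})
print(shares)
```

Shifting load from the slowest node to the fastest after each job is the simplest hill-climbing move; repeated across jobs it nudges the assignment toward a point where all nodes finish at roughly the same time.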
Citations
Proceedings Article•DOI•
27 Jun 2016
TL;DR: GrapH is the first graph processing system using vertex-cut graph partitioning that considers both diverse vertex traffic and heterogeneous network costs to minimize overall communication costs; the main idea is to avoid frequent communication over expensive network links using an adaptive edge migration strategy.
Abstract: Vertex-centric graph processing systems such as Pregel, PowerGraph, or GraphX recently gained popularity due to their superior performance of data analytics on graph-structured data. These systems exploit the graph structure to improve data access locality during computation, making use of specialized graph partitioning algorithms. Recent partitioning techniques assume a uniform and constant amount of data exchanged between graph vertices (i.e., uniform vertex traffic) and homogeneous underlying network costs. However, in real-world scenarios vertex traffic and network costs are heterogeneous. This leads to suboptimal partitioning decisions and inefficient graph processing. To this end, we designed GrapH, the first graph processing system using vertex-cut graph partitioning that considers both diverse vertex traffic and heterogeneous network costs to minimize overall communication costs. The main idea is to avoid frequent communication over expensive network links using an adaptive edge migration strategy. Our evaluations show an improvement of 60% in communication costs compared to state-of-the-art partitioning approaches.
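The edge-migration idea reduces to a cost test: move an edge to another worker only if the expected saving on network traffic outweighs the one-time cost of moving it. The sketch below is a hypothetical illustration; the traffic figures, link costs, and overhead model are assumptions, not GrapH's internal data structures or API.

```python
# Illustrative cost test behind adaptive edge migration: migrate an
# edge only when the communication saving exceeds the migration cost.

def comm_cost(edge_traffic, link_cost):
    """Cost of carrying an edge's traffic over a given link."""
    return edge_traffic * link_cost

def should_migrate(traffic, cost_current, cost_target, migration_overhead):
    """Migrate only when the saving outweighs the one-time move cost."""
    saving = comm_cost(traffic, cost_current) - comm_cost(traffic, cost_target)
    return saving > migration_overhead

# A high-traffic edge on an expensive inter-rack link is worth moving;
# the same edge on a cheap intra-rack link is not.
print(should_migrate(traffic=500, cost_current=1.0, cost_target=0.2,
                     migration_overhead=100))  # True
print(should_migrate(traffic=500, cost_current=0.2, cost_target=0.1,
                     migration_overhead=100))  # False
```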

31 citations


Additional excerpts

  • ...Hotspots arise mainly in three cases: i) the vertices process different amounts of data, ii) the graph system executes vertices a different number of times, and iii) the graph analytic algorithms concentrate on specific graph areas....

    [...]

Journal Article•DOI•
TL;DR: GrapH is developed, the first graph processing system using vertex-cut graph partitioning that considers both diverse vertex traffic and heterogeneous network costs; the main idea is to avoid frequent communication over expensive network links using an adaptive edge migration strategy.
Abstract: Distributed graph processing systems such as Pregel, PowerGraph, or GraphX gained popularity due to their superior performance of data analytics on graph-structured data. These systems employ partitioning algorithms to parallelize graph analytics while minimizing inter-partition communication. Recent partitioning algorithms, however, unrealistically assume a uniform and constant amount of data exchanged between graph vertices (i.e., uniform vertex traffic) and homogeneous network costs between workers hosting the graph partitions. This leads to suboptimal partitioning decisions and inefficient graph processing. To this end, we developed GrapH, the first graph processing system using vertex-cut graph partitioning that considers both diverse vertex traffic and heterogeneous network costs. The main idea is to avoid frequent communication over expensive network links using an adaptive edge migration strategy. Our evaluations show an improvement of 10 percent in graph processing latency and 60 percent in communication costs compared to state-of-the-art partitioning approaches.

22 citations


Cites background from "GraphIVE: Heterogeneity-Aware Adapt..."

  • ...GraphIVE [13] strives for a minimal unbalanced k-way vertex-cut for workers with heterogeneous computation and communication capabilities, in...

    [...]

  • ...deployment costs and high scalability [5], [12], [13]....

    [...]

Journal Article•DOI•
TL;DR: This work designs 8 streaming heuristics for partitioning a big graph while loading its data from external disks into memory, and demonstrates the performance and flexibility of this approach in partitioning real and synthetic graph datasets on a medium-sized cluster.

11 citations

Proceedings Article•DOI•
01 Jun 2017
TL;DR: GraphSteal, a dynamic graph re-partitioning policy for vertex-cut based graph processing frameworks on heterogeneous clusters, is proposed; experimental results show that GraphSteal significantly improves performance over Graphlab.
Abstract: With continuously growing data, clusters also need to grow periodically to accommodate the increased demand for data processing. This is usually done by adding newer hardware, whose configuration might differ from that of the existing nodes. As a result, clusters are becoming heterogeneous in nature. For many real-world machine learning and data mining applications, data is represented in the form of graphs. Most of the existing distributed graph processing frameworks, such as Pregel and Graphlab, assume that the computational nodes are homogeneous. These frameworks split the graph into approximately equal subgraphs, which is appropriate for homogeneous clusters. In heterogeneous clusters, these frameworks perform poorly in most scenarios. To the best of our knowledge, GraphIVE is the only heterogeneity-aware graph processing framework. It learns the relative capabilities of the nodes based on runtime metrics of previous jobs and partitions the graph proportionally. However, it may not perform well if a new job differs drastically in terms of resource requirements when compared to previous jobs executed on the cluster. To overcome this limitation, we propose GraphSteal, a dynamic graph re-partitioning policy for vertex-cut based graph processing frameworks on heterogeneous clusters. GraphSteal dynamically re-partitions the graph based on the runtime characteristics of the job. To avoid computational skew in the cluster, it migrates edges from slow nodes to fast nodes. To demonstrate our approach, we modify the source code of Graphlab to incorporate a dynamic graph re-partitioning strategy. Experimental results show that GraphSteal significantly improves performance over Graphlab.
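GraphSteal's migration of edges from slow to fast nodes at runtime resembles work stealing. A minimal sketch follows; the straggler test, batch size, and slack factor are illustrative assumptions, not GraphSteal's published policy.

```python
# Toy rebalancing step: after each superstep, if the slowest worker is
# lagging well behind the mean, move a batch of its edges to the
# fastest worker. Batch size and slack factor are assumptions.

def rebalance(edge_counts, step_times, batch=1000, slack=1.2):
    """Move a batch of edges from the slowest worker to the fastest one
    if the slowest is more than `slack` times slower than the mean."""
    mean_time = sum(step_times.values()) / len(step_times)
    slowest = max(step_times, key=step_times.get)
    fastest = min(step_times, key=step_times.get)
    if step_times[slowest] > slack * mean_time:
        moved = min(batch, edge_counts[slowest])
        edge_counts[slowest] -= moved
        edge_counts[fastest] += moved
    return edge_counts

print(rebalance({"w1": 10000, "w2": 10000}, {"w1": 30.0, "w2": 10.0}))
# {'w1': 9000, 'w2': 11000}
```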

9 citations


Cites background from "GraphIVE: Heterogeneity-Aware Adapt..."

  • ...Heterogeneity due to hardware is considered in GraphIVE [7]....

    [...]

  • ...GraphIVE [7] is a distributed graph processing strategy for heterogeneous clusters....

    [...]

Proceedings Article•DOI•
29 Oct 2015
TL;DR: Octopus, a fair multi-job scheduler for Graphlab, is proposed; preliminary results show that a non-preemptive time-sharing approach among users exhibits a significant gain in turnaround time when compared to spatial resource sharing.
Abstract: Graphlab, a framework for large graph processing, currently does not support scheduling multiple jobs simultaneously. However, for efficient use of the cluster resources, it may be required to share the cluster among multiple jobs. The challenges in multi-job scheduling in the case of graph processing are different from those in other frameworks such as Hadoop. In Hadoop, it is possible to schedule multiple jobs by fairly allocating resources to the jobs. We show in this paper that such an approach does not provide optimal results in the case of graph processing. We propose Octopus, a fair multi-job scheduler for Graphlab. The scheduler uses two different algorithms, viz., First Fit with round robin Filling (FFF) and First In First Out with round robin Filling (FIFOF), to schedule large jobs of a user. We compare the performance of both algorithms on a 20-node cluster. Preliminary results show that a non-preemptive time-sharing approach among users exhibits a significant gain in turnaround time when compared to spatial resource sharing.
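Round-robin filling can be sketched briefly. The toy scheduler below is an assumption-laden illustration of FIFOF, taking jobs in arrival order and placing their partitions round-robin across nodes with free slots; it is not Octopus's actual code, and its job/capacity model is deliberately simplified.

```python
# Hypothetical FIFOF sketch: jobs in FIFO order, partitions placed
# round-robin over nodes; partitions beyond total capacity are simply
# dropped in this toy model.

from collections import deque
from itertools import cycle

def fifof_schedule(jobs, nodes):
    """jobs: deque of (job_id, num_partitions); nodes: {node: free_slots}."""
    placement = []
    ring = cycle(list(nodes))
    while jobs and any(free > 0 for free in nodes.values()):
        job_id, parts = jobs.popleft()
        for _ in range(parts):
            # Round-robin over nodes, skipping full ones.
            for _ in range(len(nodes)):
                node = next(ring)
                if nodes[node] > 0:
                    nodes[node] -= 1
                    placement.append((job_id, node))
                    break
    return placement

print(fifof_schedule(deque([("j1", 3), ("j2", 2)]),
                     {"n1": 2, "n2": 2, "n3": 2}))
# [('j1', 'n1'), ('j1', 'n2'), ('j1', 'n3'), ('j2', 'n1'), ('j2', 'n2')]
```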

3 citations


Cites background from "GraphIVE: Heterogeneity-Aware Adapt..."

  • ...The node capability evaluation has been studied in GraphIVE [15]....

    [...]

References
Journal Article•DOI•
Jeffrey Dean1, Sanjay Ghemawat1•
06 Dec 2004
TL;DR: This paper presents MapReduce, a programming model and associated implementation for processing and generating large data sets; it runs on large clusters of commodity machines and is highly scalable.
Abstract: MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model, as shown in the paper. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system. Our implementation of MapReduce runs on a large cluster of commodity machines and is highly scalable: a typical MapReduce computation processes many terabytes of data on thousands of machines. Programmers find the system easy to use: hundreds of MapReduce programs have been implemented and upwards of one thousand MapReduce jobs are executed on Google's clusters every day.
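The model is easiest to see in the paper's canonical word-count example, condensed here to plain single-machine Python; a real MapReduce run would shard the input and execute the same two functions across a cluster.

```python
# Word count in the map/reduce model: map emits (word, 1) pairs, the
# framework groups intermediate values by key, and reduce sums them.

from collections import defaultdict

def map_fn(document):
    # Emit an intermediate (word, 1) pair for every word.
    for word in document.split():
        yield word, 1

def reduce_fn(word, counts):
    # Merge all intermediate values associated with the same key.
    return word, sum(counts)

def run(documents):
    groups = defaultdict(list)
    for doc in documents:                  # map phase
        for word, count in map_fn(doc):
            groups[word].append(count)     # shuffle/group by key
    return dict(reduce_fn(w, c) for w, c in groups.items())  # reduce phase

print(run(["the quick fox", "the lazy dog"]))
# {'the': 2, 'quick': 1, 'fox': 1, 'lazy': 1, 'dog': 1}
```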

20,309 citations


"GraphIVE: Heterogeneity-Aware Adapt..." refers background in this paper

  • ...Most of these focus on the MapReduce[6] framework....

    [...]

  • ...However, the ideas suggested for MapReduce are not suitable for GraphLab....

    [...]

Journal Article•DOI•
Jeffrey Dean1, Sanjay Ghemawat1•
TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
Abstract: MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks. Programmers find the system easy to use: more than ten thousand distinct MapReduce programs have been implemented internally at Google over the past four years, and an average of one hundred thousand MapReduce jobs are executed on Google's clusters every day, processing a total of more than twenty petabytes of data per day.

17,663 citations

Book•
01 Jan 2020
TL;DR: In this book, the authors present a comprehensive introduction to the theory and practice of artificial intelligence for modern applications, including game playing, planning and acting, and reinforcement learning with neural networks.
Abstract: The long-anticipated revision of this #1 selling book offers the most comprehensive, state of the art introduction to the theory and practice of artificial intelligence for modern applications. Intelligent Agents. Solving Problems by Searching. Informed Search Methods. Game Playing. Agents that Reason Logically. First-order Logic. Building a Knowledge Base. Inference in First-Order Logic. Logical Reasoning Systems. Practical Planning. Planning and Acting. Uncertainty. Probabilistic Reasoning Systems. Making Simple Decisions. Making Complex Decisions. Learning from Observations. Learning with Neural Networks. Reinforcement Learning. Knowledge in Learning. Agents that Communicate. Practical Communication in English. Perception. Robotics. For computer professionals, linguists, and cognitive scientists interested in artificial intelligence.

16,983 citations

Proceedings Article•DOI•
Haewoon Kwak1, Changhyun Lee1, Hosung Park1, Sue Moon1•
26 Apr 2010
TL;DR: In this paper, the authors have crawled the entire Twittersphere and found a non-power-law follower distribution, a short effective diameter, and low reciprocity, which all mark a deviation from known characteristics of human social networks.
Abstract: Twitter, a microblogging service less than three years old, commands more than 41 million users as of July 2009 and is growing fast. Twitter users tweet about any topic within the 140-character limit and follow others to receive their tweets. The goal of this paper is to study the topological characteristics of Twitter and its power as a new medium of information sharing. We have crawled the entire Twitter site and obtained 41.7 million user profiles, 1.47 billion social relations, 4,262 trending topics, and 106 million tweets. In its follower-following topology analysis we have found a non-power-law follower distribution, a short effective diameter, and low reciprocity, which all mark a deviation from known characteristics of human social networks [28]. In order to identify influentials on Twitter, we have ranked users by the number of followers and by PageRank and found two rankings to be similar. Ranking by retweets differs from the previous two rankings, indicating a gap in influence inferred from the number of followers and that from the popularity of one's tweets. We have analyzed the tweets of top trending topics and reported on their temporal behavior and user participation. We have classified the trending topics based on the active period and the tweets and show that the majority (over 85%) of topics are headline news or persistent news in nature. A closer look at retweets reveals that any retweeted tweet is to reach an average of 1,000 users no matter what the number of followers is of the original tweet. Once retweeted, a tweet gets retweeted almost instantly on next hops, signifying fast diffusion of information after the 1st retweet. To the best of our knowledge this work is the first quantitative study on the entire Twittersphere and information diffusion on it.

6,108 citations

Proceedings Article•DOI•
06 Jun 2010
TL;DR: A model for processing large graphs that has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier.
Abstract: Many practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs - in some cases billions of vertices, trillions of edges - poses challenges to their efficient processing. In this paper we present a computational model suitable for this task. Programs are expressed as a sequence of iterations, in each of which a vertex can receive messages sent in the previous iteration, send messages to other vertices, and modify its own state and that of its outgoing edges or mutate graph topology. This vertex-centric approach is flexible enough to express a broad set of algorithms. The model has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier. Distribution-related details are hidden behind an abstract API. The result is a framework for processing large graphs that is expressive and easy to program.
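A toy synchronous driver makes the superstep model concrete. The sketch below runs PageRank in this vertex-centric style on a single machine; the function name and the plain message lists are illustrative stand-ins for the distributed runtime and its API.

```python
# Vertex-centric supersteps in miniature: each superstep, every vertex
# receives the previous round's messages, updates its value, and sends
# messages along its out-edges. Assumes every vertex has an out-edge.

def pregel_pagerank(out_edges, supersteps=20, damping=0.85):
    n = len(out_edges)
    rank = {v: 1.0 / n for v in out_edges}
    for _ in range(supersteps):
        # Each vertex sends its rank share to its neighbors.
        outbox = {v: [] for v in out_edges}
        for v, neighbors in out_edges.items():
            for u in neighbors:
                outbox[u].append(rank[v] / len(neighbors))
        # Each vertex folds the received messages into its new value.
        rank = {v: (1 - damping) / n + damping * sum(outbox[v])
                for v in out_edges}
    return rank

print(pregel_pagerank({"a": ["b"], "b": ["a", "c"], "c": ["a"]}))
```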

3,840 citations


"GraphIVE: Heterogeneity-Aware Adapt..." refers background in this paper

  • ...Pregel[1] is a message passing system where computation proceeds in a sequence of super-steps on all vertices....

    [...]

  • ...The demand for speed and accuracy has resulted in the development of several graph-parallel abstractions like Pregel[1], GraphLab[2] and PowerGraph[3]....

    [...]

  • ...Mizan[18] is also a Pregel system with runtime monitoring for adaptive load balancing....

    [...]

  • ...GPS[17] is similar to Pregel, but has a dynamic repartitioning scheme for load balancing....

    [...]