Proceedings ArticleDOI

DynamicDFEP: A Distributed Edge Partitioning Approach for Large Dynamic Graphs

11 Jul 2016 - pp. 142-147
TL;DR: This paper proposes a graph partitioning method for large dynamic graphs, presents an implementation of the proposed approach on top of the AKKA framework, and experimentally shows that the approach is efficient for large dynamic graphs.
Abstract: Distributed graph processing has become a very popular research topic recently, particularly in domains such as the analysis of social networks, web graphs and spatial networks. In this context, graph partitioning is an important task. Several partitioning algorithms have been proposed, such as DFEP, JABEJA and POWERGRAPH, but they are limited to static graphs only. In fact, they do not consider dynamic graphs in which vertices and edges are added and/or removed. In this paper, we propose a graph partitioning method for large dynamic graphs. We present an implementation of the proposed approach on top of the AKKA framework, and we experimentally show that our approach is efficient in the case of large dynamic graphs.
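The abstract does not spell out the update mechanism here, so the sketch below is only a rough, hypothetical illustration of incremental edge partitioning for a dynamic graph (it is not the authors' DynamicDFEP algorithm or its UB-Update strategy): each newly arriving edge is greedily assigned to the partition where its endpoints already have the most edges, with the least-loaded partition as a tie-breaker, and removals simply decrement the bookkeeping. All class and method names are made up for the example.

```python
from collections import defaultdict

class IncrementalEdgePartitioner:
    """Toy incremental edge partitioner (hypothetical, not DynamicDFEP):
    new edges go to the partition where their endpoints are already most
    represented; ties and unseen vertices fall back to the least-loaded one."""

    def __init__(self, num_partitions):
        self.k = num_partitions
        self.load = [0] * num_partitions                                 # edges per partition
        self.vertex_parts = defaultdict(lambda: [0] * num_partitions)    # per-vertex edge counts per partition

    def add_edge(self, u, v):
        # Score each partition by how many edges of u and v it already holds.
        scores = [self.vertex_parts[u][p] + self.vertex_parts[v][p] for p in range(self.k)]
        best = max(range(self.k), key=lambda p: (scores[p], -self.load[p]))
        self.load[best] += 1
        self.vertex_parts[u][best] += 1
        self.vertex_parts[v][best] += 1
        return best

    def remove_edge(self, u, v, part):
        # The caller supplies the partition the edge was originally assigned to.
        self.load[part] -= 1
        self.vertex_parts[u][part] -= 1
        self.vertex_parts[v][part] -= 1

# Example: stream a few edge insertions.
p = IncrementalEdgePartitioner(num_partitions=3)
print([p.add_edge(*e) for e in [(1, 2), (2, 3), (3, 4), (1, 3)]])
```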
Citations
Journal ArticleDOI
TL;DR: This paper presents bladyg, a graph processing framework that addresses the issue of dynamism in large-scale graphs and experimentally evaluates the performance of the proposed framework by applying it to problems such as distributed k-core decomposition and partitioning of large dynamic graphs.

25 citations


Cites methods from "DynamicDFEP: A Distributed Edge Par..."

  • ...In this section, we apply bladyg to solve some classic graph operations such as k–core decomposition [17] [3], clique computation [28] and graph partitioning [10] [20]....

  • ...For our tests, we used the graph datasets described in Table 1 and we considered three partitioning techniques (1) hash partitioning, (2) random partitioning and (3) DynamicDFEP, a previously published distributed partitioning algorithm [20]....

  • ...For our tests, we used the UnitBased Update strategy (UB-Update) described in [20]....

Journal ArticleDOI
TL;DR: Bladyg as discussed by the authors is a block-centric framework that addresses the issue of scale and dynamism in large-scale graphs, and it is implemented on top of the Akka framework.
Abstract: Recently, distributed processing of large dynamic graphs has become very popular, especially in domains such as social network analysis, Web graph analysis and spatial network analysis. In this context, many distributed/parallel graph processing systems have been proposed, such as Pregel, GraphLab, and Trinity. These systems can be divided into two categories: (1) vertex-centric and (2) block-centric approaches. In vertex-centric approaches, each vertex corresponds to a process, and messages are exchanged among vertices. In block-centric approaches, the unit of computation is a block, a connected subgraph of the graph, and message exchanges occur among blocks. In this paper, we consider the issues of scale and dynamism in the case of block-centric approaches. We present BLADYG, a block-centric framework that addresses the issue of dynamism in large-scale graphs. We present an implementation of BLADYG on top of the Akka framework. We experimentally evaluate the performance of the proposed framework.
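As a rough sketch of the block-centric style described above (illustrative only, not BLADYG's actual API), the snippet below computes connected-component labels with a block-level superstep loop: each block first iterates locally to a fixed point over its own subgraph, and information then crosses block boundaries only along cut edges.

```python
# A minimal block-centric superstep loop (illustrative only; not BLADYG's API).
def block_centric_components(blocks, cut_edges, labels):
    """blocks:    {block_id: {vertex: set_of_neighbors_within_block}}
    cut_edges: list of (u, v) pairs whose endpoints live in different blocks
    labels:    {vertex: initial_label}; returns the converged labels."""
    changed = True
    while changed:                      # one outer iteration = one superstep
        changed = False
        # 1) Local phase: each block propagates minimum labels internally.
        for subgraph in blocks.values():
            local_changed = True
            while local_changed:
                local_changed = False
                for v, nbrs in subgraph.items():
                    best = min([labels[v]] + [labels[n] for n in nbrs])
                    if best < labels[v]:
                        labels[v] = best
                        local_changed = changed = True
        # 2) Communication phase: exchange labels across cut edges only.
        for u, v in cut_edges:
            low = min(labels[u], labels[v])
            if labels[u] != low or labels[v] != low:
                labels[u] = labels[v] = low
                changed = True
    return labels

# Example: two blocks joined by a single cut edge; all labels converge to 1.
blocks = {0: {1: {2}, 2: {1}}, 1: {3: {4}, 4: {3}}}
print(block_centric_components(blocks, cut_edges=[(2, 3)], labels={v: v for v in (1, 2, 3, 4)}))
```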

23 citations

Journal ArticleDOI
TL;DR: A quick reference guide to recent engineering and theory results in the area of fully dynamic graph algorithms.
Abstract: In recent years, significant advances have been made in the design and analysis of fully dynamic algorithms. However, these theoretical results have received very little attention from the practical perspective. Few of the algorithms are implemented and tested on real datasets, and their practical potential is far from understood. Here, we present a quick reference guide to recent engineering and theory results in the area of fully dynamic graph algorithms.

8 citations

Journal ArticleDOI
TL;DR: In this paper, the authors present an overview, classification, and investigation of the most popular graph partitioning and computing systems and discuss future challenges and research directions in graph partitioning and computing.
Abstract: Graphs are a highly suitable data representation that models the relationships of entities in many application domains, such as recommendation systems, machine learning, computational biology, and social network analysis. Graphs with many vertices and edges have become quite prevalent in recent years. Therefore, graph computing systems that integrate various graph partitioning techniques have been envisioned as a promising paradigm to handle large-scale graph analytics in these application domains. However, scalable processing of large-scale graphs is challenging due to their high volume and the inherently irregular structure of real-world graphs. Hence, industry and academia have recently proposed graph partitioning and computing systems to efficiently process and analyze large-scale graphs. These systems have been designed to improve scalability and reduce processing time. This paper presents an overview, classification, and investigation of the most popular graph partitioning and computing systems. The various methods and approaches of graph partitioning and the diverse categories of graph computing systems are presented. Finally, we discuss future challenges and research directions in graph partitioning and computing systems.

4 citations

Posted Content
TL;DR: A quick reference guide to recent engineering and theory results in the area of fully dynamic graph algorithms can be found in this article, with a focus on the practical potential of the proposed algorithms.
Abstract: In recent years, significant advances have been made in the design and analysis of fully dynamic algorithms. However, these theoretical results have received very little attention from the practical perspective. Few of the algorithms are implemented and tested on real datasets, and their practical potential is far from understood. Here, we present a quick reference guide to recent engineering and theory results in the area of fully dynamic graph algorithms.

4 citations

References
Journal ArticleDOI
TL;DR: This work presents a new coarsening heuristic (called the heavy-edge heuristic) for which the size of the partition of the coarse graph is within a small factor of the size of the final partition obtained after multilevel refinement, and presents a much faster variation of the Kernighan--Lin (KL) algorithm for refinement during uncoarsening.
Abstract: Recently, a number of researchers have investigated a class of graph partitioning algorithms that reduce the size of the graph by collapsing vertices and edges, partition the smaller graph, and then uncoarsen it to construct a partition for the original graph [Bui and Jones, Proc. of the 6th SIAM Conference on Parallel Processing for Scientific Computing, 1993, 445--452; Hendrickson and Leland, A Multilevel Algorithm for Partitioning Graphs, Tech. report SAND 93-1301, Sandia National Laboratories, Albuquerque, NM, 1993]. From the early work it was clear that multilevel techniques held great promise; however, it was not known if they can be made to consistently produce high quality partitions for graphs arising in a wide range of application domains. We investigate the effectiveness of many different choices for all three phases: coarsening, partition of the coarsest graph, and refinement. In particular, we present a new coarsening heuristic (called heavy-edge heuristic) for which the size of the partition of the coarse graph is within a small factor of the size of the final partition obtained after multilevel refinement. We also present a much faster variation of the Kernighan--Lin (KL) algorithm for refining during uncoarsening. We test our scheme on a large number of graphs arising in various domains including finite element methods, linear programming, VLSI, and transportation. Our experiments show that our scheme produces partitions that are consistently better than those produced by spectral partitioning schemes in substantially smaller time. Also, when our scheme is used to compute fill-reducing orderings for sparse matrices, it produces orderings that have substantially smaller fill than the widely used multiple minimum degree algorithm.
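The heavy-edge heuristic mentioned in this abstract can be summarized in a few lines: visit vertices in some order and match each unmatched vertex with the unmatched neighbor joined by the heaviest edge; matched pairs are then collapsed into multi-vertices for the next coarser graph. The sketch below shows one such matching pass (an illustrative Python rendering, not METIS code).

```python
def heavy_edge_matching(adj):
    """One coarsening pass of the heavy-edge heuristic.
    adj: {vertex: {neighbor: edge_weight}}; returns {vertex: matched_vertex_or_self}."""
    match = {}
    for v in adj:                                    # visit vertices in arbitrary order
        if v in match:
            continue
        # Among unmatched neighbors, pick the one joined by the heaviest edge.
        candidates = [(w, u) for u, w in adj[v].items() if u not in match]
        if candidates:
            _, u = max(candidates)
            match[v], match[u] = u, v                # collapse v and u into one multi-vertex
        else:
            match[v] = v                             # no unmatched neighbor: stays a singleton
    return match

# Example: a small weighted graph.
adj = {
    'a': {'b': 5, 'c': 1},
    'b': {'a': 5, 'c': 2},
    'c': {'a': 1, 'b': 2, 'd': 4},
    'd': {'c': 4},
}
print(heavy_edge_matching(adj))   # {'a': 'b', 'b': 'a', 'c': 'd', 'd': 'c'}
```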

5,629 citations


"DynamicDFEP: A Distributed Edge Par..." refers methods in this paper

  • ...metis [5] is a vertex partitioning algorithm based on the Multilevel Graph Partitioning concept [4]....

Proceedings ArticleDOI
06 Jun 2010
TL;DR: A model for processing large graphs designed for efficient, scalable, and fault-tolerant implementation on clusters of thousands of commodity computers; its implied synchronicity makes reasoning about programs easier.
Abstract: Many practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs - in some cases billions of vertices, trillions of edges - poses challenges to their efficient processing. In this paper we present a computational model suitable for this task. Programs are expressed as a sequence of iterations, in each of which a vertex can receive messages sent in the previous iteration, send messages to other vertices, and modify its own state and that of its outgoing edges or mutate graph topology. This vertex-centric approach is flexible enough to express a broad set of algorithms. The model has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier. Distribution-related details are hidden behind an abstract API. The result is a framework for processing large graphs that is expressive and easy to program.
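The vertex-centric superstep model described above is straightforward to imitate on a single machine: in each superstep, every vertex that received messages updates its value and sends new messages to its out-neighbors, and computation halts once no messages remain. The sketch below is such an imitation (not Pregel's actual API), shown with a maximum-value-propagation compute function.

```python
from collections import defaultdict

def run_supersteps(graph, values, compute):
    """Toy single-machine imitation of a vertex-centric superstep loop.
    graph:   {vertex: [out_neighbors]}
    values:  {vertex: initial_value}
    compute: (vertex, value, incoming_msgs, out_neighbors) -> (new_value, {target: msg})"""
    inbox = {v: [None] for v in graph}          # superstep 0: every vertex is active
    while any(inbox.values()):
        outbox = defaultdict(list)
        for v, msgs in inbox.items():
            if not msgs:
                continue                        # a vertex without messages stays halted
            values[v], outgoing = compute(v, values[v], msgs, graph[v])
            for target, msg in outgoing.items():
                outbox[target].append(msg)
        inbox = {v: outbox.get(v, []) for v in graph}
    return values

# Example compute function: propagate the maximum value along directed edges.
def max_propagation(v, value, msgs, neighbors):
    new_value = max([value] + [m for m in msgs if m is not None])
    if new_value > value or None in msgs:       # value changed (or first superstep): notify neighbors
        return new_value, {n: new_value for n in neighbors}
    return new_value, {}

graph = {1: [2], 2: [3], 3: [1], 4: [3]}
print(run_supersteps(graph, {v: v for v in graph}, max_propagation))   # all vertices end at 4
```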

3,840 citations


"DynamicDFEP: A Distributed Edge Par..." refers background in this paper

  • ...different partitions are called cut edges and may be considered as "communication channels" that the nodes will use to coordinate the different partitions [8]....

  • ...Consequently, large-scale distributed/parallel frameworks such as pregel [8], graphlab [7] and giraph [3] have emerged....

01 Jun 2014
TL;DR: A collection of more than 50 large network datasets ranging from tens of thousands of nodes and edges to tens of millions of nodes and edges, including social networks, web graphs, road networks, internet networks, citation networks, collaboration networks, and communication networks.
Abstract: A collection of more than 50 large network datasets ranging from tens of thousands of nodes and edges to tens of millions of nodes and edges. It includes social networks, web graphs, road networks, internet networks, citation networks, collaboration networks, and communication networks.

3,135 citations


"DynamicDFEP: A Distributed Edge Par..." refers methods in this paper

  • ...The used datasets are made available by the Stanford Large Network Dataset collection [6]....

Proceedings ArticleDOI
08 Oct 2012
TL;DR: This paper describes the challenges of computation on natural graphs in the context of existing graph-parallel abstractions and introduces the PowerGraph abstraction which exploits the internal structure of graph programs to address these challenges.
Abstract: Large-scale graph-structured computation is central to tasks ranging from targeted advertising to natural language processing and has led to the development of several graph-parallel abstractions including Pregel and GraphLab. However, the natural graphs commonly found in the real world have highly skewed power-law degree distributions, which challenge the assumptions made by these abstractions, limiting performance and scalability. In this paper, we characterize the challenges of computation on natural graphs in the context of existing graph-parallel abstractions. We then introduce the PowerGraph abstraction which exploits the internal structure of graph programs to address these challenges. Leveraging the PowerGraph abstraction we introduce a new approach to distributed graph placement and representation that exploits the structure of power-law graphs. We provide a detailed analysis and experimental evaluation comparing PowerGraph to two popular graph-parallel systems. Finally, we describe three different implementation strategies for PowerGraph and discuss their relative merits with empirical evaluations on large-scale real-world problems demonstrating order of magnitude gains.
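The greedy streaming edge placement referenced above (and in the citing snippet below) can be approximated as follows: for each edge in the stream, prefer a machine that already holds replicas of both endpoints, then one that holds either endpoint, and otherwise the least-loaded machine. The sketch below is only an illustration in the spirit of greedy vertex-cut partitioning, not PowerGraph's implementation.

```python
def greedy_vertex_cut(edge_stream, num_machines):
    """Toy greedy streaming edge placement in the spirit of vertex-cut
    partitioning: minimize new vertex replicas, break ties by machine load."""
    replicas = {}                      # vertex -> set of machines holding a replica of it
    load = [0] * num_machines
    assignment = []
    for u, v in edge_stream:
        ru = replicas.setdefault(u, set())
        rv = replicas.setdefault(v, set())
        both, either = ru & rv, ru | rv
        if both:
            candidates = both          # some machine already has both endpoints
        elif either:
            candidates = either        # some machine has at least one endpoint
        else:
            candidates = set(range(num_machines))
        m = min(candidates, key=lambda x: load[x])   # least-loaded candidate machine
        load[m] += 1
        ru.add(m)
        rv.add(m)
        assignment.append(((u, v), m))
    return assignment

# Example: four streamed edges placed on two machines.
print(greedy_vertex_cut([(1, 2), (2, 3), (1, 3), (4, 5)], num_machines=2))
```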

1,710 citations


"DynamicDFEP: A Distributed Edge Par..." refers background in this paper

  • ...powergraph [1] presents a greedy approach that partitions a stream of edges....

Journal ArticleDOI
01 Apr 2012
TL;DR: Distributed GraphLab, as discussed by the authors, extends the GraphLab framework from the shared-memory setting to the substantially more challenging distributed setting while preserving strong data consistency guarantees, using graph-based extensions to pipelined locking and data versioning to reduce network congestion and mitigate the effect of network latency.
Abstract: While high-level data parallel frameworks, like MapReduce, simplify the design and implementation of large-scale data processing systems, they do not naturally or efficiently support many important data mining and machine learning algorithms and can lead to inefficient learning systems. To help fill this critical void, we introduced the GraphLab abstraction which naturally expresses asynchronous, dynamic, graph-parallel computation while ensuring data consistency and achieving a high degree of parallel performance in the shared-memory setting. In this paper, we extend the GraphLab framework to the substantially more challenging distributed setting while preserving strong data consistency guarantees. We develop graph based extensions to pipelined locking and data versioning to reduce network congestion and mitigate the effect of network latency. We also introduce fault tolerance to the GraphLab abstraction using the classic Chandy-Lamport snapshot algorithm and demonstrate how it can be easily implemented by exploiting the GraphLab abstraction itself. Finally, we evaluate our distributed implementation of the GraphLab abstraction on a large Amazon EC2 deployment and show 1-2 orders of magnitude performance gains over Hadoop-based implementations.
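The Chandy-Lamport snapshot mentioned above rests on a simple marker protocol: the initiator records its own state and sends a marker on every outgoing channel; each other process records its state on the first marker it receives, forwards markers, and logs, for each incoming channel, the messages that arrive between its own recording and that channel's marker. The toy simulation below (assumed FIFO channels, a single-threaded driver, and hypothetical class names) only illustrates this bookkeeping; it is not GraphLab's implementation.

```python
from collections import deque

MARKER = object()   # sentinel marker message

class Process:
    """Toy process for a Chandy-Lamport snapshot over simulated FIFO channels."""
    def __init__(self, pid, state, peers):
        self.pid, self.state, self.peers = pid, state, peers
        self.recorded_state = None
        self.channel_log = {}           # peer -> in-flight messages recorded on that incoming channel
        self.recording = set()          # peers whose incoming channel is still being recorded

    def start_snapshot(self, channels):
        self._record_and_broadcast(channels)

    def _record_and_broadcast(self, channels):
        self.recorded_state = self.state
        self.recording = set(self.peers)
        self.channel_log = {p: [] for p in self.peers}
        for p in self.peers:
            channels[(self.pid, p)].append(MARKER)   # send a marker on every outgoing channel

    def receive(self, sender, msg, channels):
        if msg is MARKER:
            if self.recorded_state is None:
                self._record_and_broadcast(channels)
            self.recording.discard(sender)           # the channel from sender is now closed
        elif self.recorded_state is not None and sender in self.recording:
            self.channel_log[sender].append(msg)     # in-flight message belongs to the snapshot
        # (application-level handling of msg would go here)

# Tiny driver: message "y" is in flight from B to A when the snapshot starts,
# so it is recorded as part of channel B -> A.
channels = {("A", "B"): deque(["x"]), ("B", "A"): deque(["y"])}
procs = {"A": Process("A", state=1, peers=["B"]), "B": Process("B", state=2, peers=["A"])}
procs["A"].start_snapshot(channels)
while any(channels.values()):
    for (src, dst), q in channels.items():
        if q:
            procs[dst].receive(src, q.popleft(), channels)
print({p: (pr.recorded_state, pr.channel_log) for p, pr in procs.items()})
```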

1,505 citations