
Showing papers on "Wait-for graph" published in 2017


Proceedings ArticleDOI
26 Jun 2017
TL;DR: In this paper, the authors investigate the applicability of the push-pull dichotomy to various algorithms and its impact on complexity, performance, and the amount of locks, atomics, and reads/writes used.
Abstract: We reduce the cost of communication and synchronization in graph processing by analyzing the fastest way to process graphs: pushing the updates to a shared state or pulling the updates to a private state. We investigate the applicability of this push-pull dichotomy to various algorithms and its impact on complexity, performance, and the amount of used locks, atomics, and reads/writes. We consider 11 graph algorithms, 3 programming models, 2 graph abstractions, and various families of graphs. The conducted analysis illustrates surprising differences between push and pull variants of different algorithms in performance, speed of convergence, and code complexity; the insights are backed up by performance data from hardware counters. We use these findings to illustrate which variant is faster for each algorithm and to develop generic strategies that enable even higher speedups. Our insights can be used to accelerate graph processing engines or libraries on both massively-parallel shared-memory machines as well as distributed-memory systems.
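
As a rough Python illustration of the push/pull distinction the abstract refers to (our own sketch on a toy 3-vertex graph, not the authors' code), the two functions below compute the same PageRank-style update: the push variant writes contributions into shared next-state slots, while the pull variant only reads its in-neighbours and writes its own entry.

```python
# Hypothetical illustration of push vs. pull updates (not the paper's code).
out_edges = {0: [1, 2], 1: [2], 2: [0]}
in_edges = {0: [2], 1: [0], 2: [0, 1]}
ranks = {v: 1.0 / 3 for v in out_edges}
damping = 0.85

def push_step(ranks):
    # Push: each vertex scatters contributions into a shared next-state table.
    # In a parallel setting these writes would need atomics or locks.
    nxt = {v: (1 - damping) / len(ranks) for v in ranks}
    for v, succs in out_edges.items():
        share = damping * ranks[v] / len(succs)
        for u in succs:
            nxt[u] += share          # write to shared state
    return nxt

def pull_step(ranks):
    # Pull: each vertex reads its in-neighbours and updates only its own state,
    # so no synchronization on writes is required.
    nxt = {}
    for v, preds in in_edges.items():
        total = sum(ranks[u] / len(out_edges[u]) for u in preds)
        nxt[v] = (1 - damping) / len(ranks) + damping * total
    return nxt

for _ in range(20):
    ranks = pull_step(ranks)   # push_step(ranks) converges to the same values
print({v: round(r, 3) for v, r in ranks.items()})
```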

124 citations


Journal ArticleDOI
TL;DR: Signed graph-based multi-agent systems provide models to opinion dynamics and social networks, and may also hold significance in further developing such internet search algorithms as PageRank to counter spamming websites.
Abstract: This paper investigates sign-consensus problems of general linear multi-agent systems. The interaction between agents is modeled by a signed directed graph, where both cooperation and competition coexist within a group. The graph is allowed to be structurally unbalanced and its adjacency matrix is assumed to be eventually positive. Distributed control laws are proposed for several classes of graph topologies. Simulation examples are provided to illustrate the proposed control laws. Signed graph-based multi-agent systems provide models to opinion dynamics and social networks, and may also hold significance in further developing such internet search algorithms as PageRank to counter spamming websites.
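
For intuition only, here is a heavily simplified single-integrator sign-consensus simulation in the Altafini style on a structurally balanced signed graph; the adjacency matrix and initial opinions are invented, and the paper itself treats general linear agents and eventually positive, possibly structurally unbalanced adjacency matrices.

```python
# Heavily simplified single-integrator sign-consensus sketch (Altafini-style update);
# the adjacency matrix below is hypothetical and structurally balanced, whereas the
# paper treats general linear agents and eventually positive adjacency matrices.
import numpy as np

# Signed adjacency: positive = cooperation, negative = competition.
A = np.array([[ 0.0,  1.0, -1.0],
              [ 1.0,  0.0, -1.0],
              [-1.0, -1.0,  0.0]])
x = np.array([1.0, -2.0, 0.5])   # initial opinions (made up)
dt = 0.01

for _ in range(5000):
    dx = np.zeros_like(x)
    for i in range(len(x)):
        for j in range(len(x)):
            if A[i, j] != 0.0:
                # x_i' = -sum_j |a_ij| * (x_i - sign(a_ij) * x_j)
                dx[i] -= abs(A[i, j]) * (x[i] - np.sign(A[i, j]) * x[j])
    x = x + dt * dx

print(x)   # approx [-0.5, -0.5, 0.5]: equal magnitudes, signs split into two camps
```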

93 citations


Journal ArticleDOI
01 Apr 2017
TL;DR: gMark, as discussed by the authors, is a domain- and query language-independent graph instance and query workload generator that targets and controls the diversity of properties of both the generated instances and the generated workloads coupled to these instances.
Abstract: Massive graph data sets are pervasive in contemporary application domains. Hence, graph database systems are becoming increasingly important. In the experimental study of these systems, it is vital that the research community has shared solutions for the generation of database instances and query workloads having predictable and controllable properties. In this paper, we present the design and engineering principles of gMark, a domain- and query language-independent graph instance and query workload generator. A core contribution of gMark is its ability to target and control the diversity of properties of both the generated instances and the generated workloads coupled to these instances. Further novelties include support for regular path queries, a fundamental graph query paradigm, and schema-driven selectivity estimation of queries, a key feature in controlling workload chokepoints. We illustrate the flexibility and practical usability of gMark by showcasing the framework's capabilities in generating high quality graphs and workloads, and its ability to encode user-defined schemas across a variety of application domains.

71 citations


Journal ArticleDOI
01 Aug 2017
TL;DR: ExtraV is a framework for near-storage graph processing based on the novel concept of graph virtualization, which efficiently utilizes a cache-coherent hardware accelerator at the storage side to achieve performance and flexibility at the same time.
Abstract: In this paper, we propose ExtraV, a framework for near-storage graph processing. It is based on the novel concept of graph virtualization, which efficiently utilizes a cache-coherent hardware accelerator at the storage side to achieve performance and flexibility at the same time. ExtraV consists of four main components: 1) host processor, 2) main memory, 3) AFU (Accelerator Function Unit) and 4) storage. The AFU, a hardware accelerator, sits between the host processor and storage. Using a coherent interface that allows main memory accesses, it performs graph traversal functions that are common to various algorithms while the program running on the host processor (called the host program) manages the overall execution along with more application-specific tasks. Graph virtualization is a high-level programming model of graph processing that allows designers to focus on algorithm-specific functions. Realized by the accelerator, graph virtualization gives the host programs an illusion that the graph data reside on the main memory in a layout that fits with the memory access behavior of host programs even though the graph data are actually stored in a multi-level, compressed form in storage. We prototyped ExtraV on a Power8 machine with a CAPI-enabled FPGA. Our experiments on a real system prototype show significant speedups compared to state-of-the-art software-only implementations.

61 citations


Proceedings Article
12 Jul 2017
TL;DR: It is demonstrated that NUMA-awareness and its attendant pre-processing costs are beneficial only on large machines and for certain algorithms, calling into question the benefits of proposed algorithmic optimizations that rely on extensive preprocessing.
Abstract: Graph processing systems are used in a wide variety of fields, ranging from biology to social networks, and a large number of such systems have been described in the recent literature. We perform a systematic comparison of various techniques proposed to speed up in-memory multicore graph processing. In addition, we take an end-to-end view of execution time, including not only algorithm execution time, but also pre-processing time and the time to load the graph input data from storage. More specifically, we study various data structures to represent the graph in memory, various approaches to pre-processing and various ways to structure the graph computation. We also investigate approaches to improve cache locality, synchronization, and NUMA-awareness. In doing so, we take our inspiration from a number of graph processing systems, and implement the techniques they propose in a single system. We then selectively enable different techniques, allowing us to assess their benefits in isolation and independent of unrelated implementation considerations. Our main observation is that the cost of pre-processing in many circumstances dominates the cost of algorithm execution, calling into question the benefits of proposed algorithmic optimizations that rely on extensive preprocessing. Equally surprising, using radix sort turns out to be the most efficient way of pre-processing the graph input data into adjacency lists, when the graph input data is already in memory or is loaded from fast storage. Furthermore, we adapt a technique developed for out-of-core graph processing, and show that it significantly improves cache locality. Finally, we demonstrate that NUMA-awareness and its attendant pre-processing costs are beneficial only on large machines and for certain algorithms.
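
The remark about sorting-based pre-processing can be made concrete with a small sketch (hypothetical, not the authors' implementation) that turns an in-memory edge list into a CSR-style adjacency structure using a counting pass over source ids, which is the per-digit step of a radix sort.

```python
# Hypothetical sketch: build a CSR-style adjacency list from an edge list
# with one counting pass over source ids (the core step of a radix/counting sort).
def edges_to_csr(num_vertices, edges):
    # 1. Count the out-degree of every source vertex.
    degree = [0] * num_vertices
    for src, _ in edges:
        degree[src] += 1
    # 2. Prefix-sum the degrees to get each vertex's offset into the edge array.
    offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        offsets[v + 1] = offsets[v] + degree[v]
    # 3. Scatter destinations into their slots in a single stable pass.
    cursor = offsets[:-1].copy()
    neighbors = [0] * len(edges)
    for src, dst in edges:
        neighbors[cursor[src]] = dst
        cursor[src] += 1
    return offsets, neighbors

offsets, neighbors = edges_to_csr(4, [(0, 1), (2, 3), (0, 2), (1, 3), (2, 0)])
for v in range(4):
    print(v, neighbors[offsets[v]:offsets[v + 1]])
```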

58 citations


Proceedings ArticleDOI
09 May 2017
TL;DR: This work revisits SQL recursive queries, shows that the 4 proposed operations together with the others are guaranteed to have a fixpoint, following techniques studied for Datalog, and enhances the recursive WITH clause of SQL'99.
Abstract: To support analytics on massive graphs such as online social networks, RDF, Semantic Web, etc. many new graph algorithms are designed to query graphs for a specific problem, and many distributed graph processing systems are developed to support graph querying by programming. In this paper, we focus on RDBMSs, which have been well studied over decades to manage large datasets, and we revisit the issue of how an RDBMS can support graph processing at the SQL level. Our work is motivated by the fact that there are many relations stored in an RDBMS that are closely related to a graph in real applications and need to be used together to query the graph, and an RDBMS is a system that can query and manage data while data may be updated over time. To support graph processing, in this work, we propose 4 new relational algebra operations, MM-join, MV-join, anti-join, and union-by-update. Here, MM-join and MV-join are join operations between two matrices and between a matrix and a vector, respectively, followed by aggregation computing over groups, given that a matrix/vector can be represented by a relation. Both deal with the semiring by which many graph algorithms can be supported. The anti-join removes nodes/edges in a graph when they are unnecessary for the following computing. The union-by-update addresses value updates to compute PageRank, for example. The 4 new relational algebra operations can be defined by the 6 basic relational algebra operations with group-by & aggregation. We revisit SQL recursive queries and show that the 4 operations together with the others are guaranteed to have a fixpoint, following techniques studied for Datalog, and we enhance the recursive WITH clause of SQL'99. We conduct extensive performance studies to test 10 graph algorithms using 9 large real graphs in 3 major RDBMSs. We show that RDBMSs are capable of dealing with graph processing in reasonable time. The focus of this work is at the SQL level. There is high potential to improve the efficiency by main-memory RDBMSs, efficient join processing in parallel, and new storage management.
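
To make the MV-join idea concrete, the following Python sketch (our own illustration with made-up relations, not the paper's SQL) performs a join between a matrix relation and a vector relation followed by group-by aggregation, the pattern the authors use to express one PageRank-style step.

```python
# Illustrative MV-join: join a matrix relation M(src, dst, weight) with a vector
# relation V(node, value), then aggregate grouped on dst. The paper expresses the
# same pattern in SQL; relations and values here are hypothetical.
from collections import defaultdict

M = [  # (src, dst, weight): a column-normalized transition matrix as a relation
    (0, 1, 0.5), (0, 2, 0.5),
    (1, 2, 1.0),
    (2, 0, 1.0),
]
V = [(0, 1 / 3), (1, 1 / 3), (2, 1 / 3)]   # (node, rank)

def mv_join(M, V, damping=0.85):
    ranks = dict(V)
    grouped = defaultdict(float)
    # Join M with V on M.src = V.node, multiply, then SUM(...) GROUP BY dst.
    for src, dst, w in M:
        grouped[dst] += w * ranks[src]
    n = len(ranks)
    return [(node, (1 - damping) / n + damping * grouped[node]) for node in ranks]

print(mv_join(M, V))   # the vector relation after one PageRank-style step
```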

45 citations


Proceedings ArticleDOI
25 Jun 2017
TL;DR: This paper addresses the problem of breaking cycles while preserving the logical structure (hierarchy) of a directed graph as much as possible, and infers graph hierarchy using a range of features, including a Bayesian skill rating system and a social agony metric.
Abstract: Taxonomy graphs that capture hyponymy or meronymy relationships through directed edges are expected to be acyclic. However, in practice, they may have thousands of cycles, as they are often created in a crowd-sourced way. Since these cycles represent logical fallacies, they need to be removed for many web applications. In this paper, we address the problem of breaking cycles while preserving the logical structure (hierarchy) of a directed graph as much as possible. Existing approaches for this problem either need manual intervention or use heuristics that can critically alter the taxonomy structure. In contrast, our approach infers graph hierarchy using a range of features, including a Bayesian skill rating system and a social agony metric. We also devise several strategies to leverage the inferred hierarchy for removing a small subset of edges to make the graph acyclic. Extensive experiments demonstrate the effectiveness of our approach.
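
As a much-simplified sketch of the general idea, and not the authors' Bayesian skill-rating or social-agony method, one can assign each node a hierarchy score and drop the edges that point against it; the tiny taxonomy and scores below are invented for illustration.

```python
# Simplified sketch of hierarchy-based cycle breaking (illustrative ranking only;
# the paper infers the hierarchy with Bayesian skill ratings and a social agony metric).
edges = [("animal", "dog"), ("animal", "cat"), ("dog", "puppy"),
         ("puppy", "animal")]   # the last edge introduces a cycle

# Hypothetical hierarchy scores: smaller = higher in the taxonomy.
rank = {"animal": 0, "dog": 1, "cat": 1, "puppy": 2}

# Keep only edges that go down the hierarchy; edges pointing upward are removed.
kept = [(u, v) for u, v in edges if rank[u] < rank[v]]
removed = [e for e in edges if e not in kept]
print("kept:", kept)
print("removed:", removed)   # [('puppy', 'animal')]: the remaining graph is acyclic
```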

31 citations


Proceedings ArticleDOI
19 May 2017
TL;DR: This paper presents a graph database benchmarking architecture built on the existing LDBC Social Network Benchmark, and evaluates a selection of specialized graph databases, RDF stores, and RDBMSes adapted for graphs, finding that specialized graph databases do not provide definitively better performance.
Abstract: With the advent of online social networks, there is an increasing demand for storage and processing of graph-structured data. Social networking applications pose new challenges to data management systems due to demand for real-time querying and manipulation of the graph structure. Recently, several specialized systems for graph-structured data have been introduced. However, whether we should abandon mature RDBMS technology for graph databases remains an ongoing discussion. In this paper we present a graph database benchmarking architecture built on the existing LDBC Social Network Benchmark. Our proposed architecture stresses the systems with an interactive transactional workload to better simulate the real-time nature of social networking applications. Using this improved architecture, we evaluated a selection of specialized graph databases, RDF stores, and RDBMSes adapted for graphs. We do not find that specialized graph databases provide definitively better performance.

30 citations


Journal ArticleDOI
TL;DR: This paper focuses on the graph database Neo4j and its capabilities for expressing a database schema and integrity constraints (ICs), and extends these capabilities through new constructs in a Neo4j DDL, including a prototype implementation and experiments.

26 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed approach is faster and more accurate than existing algorithms in discovering deadlock states in most of the case studies with large state spaces.
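
Since this topic page concerns wait-for graphs, a generic sketch of wait-for-graph deadlock detection may help place this result; it is textbook DFS cycle detection over an assumed wait-for relation, not the paper's state-space exploration technique.

```python
# Generic wait-for-graph deadlock check (textbook cycle detection via DFS);
# the paper itself explores large state spaces, this only illustrates the concept.
def find_deadlock(wait_for):
    """wait_for maps each process to the processes it is waiting on.
    Returns a cycle (list of processes) if one exists, else None."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {p: WHITE for p in wait_for}
    stack = []

    def dfs(p):
        color[p] = GREY
        stack.append(p)
        for q in wait_for.get(p, ()):
            if color.get(q, WHITE) == GREY:            # back edge -> cycle found
                return stack[stack.index(q):] + [q]
            if color.get(q, WHITE) == WHITE:
                cycle = dfs(q)
                if cycle:
                    return cycle
        stack.pop()
        color[p] = BLACK
        return None

    for p in wait_for:
        if color[p] == WHITE:
            cycle = dfs(p)
            if cycle:
                return cycle
    return None

# P1 waits for P2, P2 waits for P3, P3 waits for P1: a deadlock.
print(find_deadlock({"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}))
```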

25 citations


Proceedings ArticleDOI
19 May 2017
TL;DR: An end-to-end graph analysis framework, called GraphGen, that sits atop an RDBMS, and supports graph querying/analytics through defining graphs as transformations over underlying relational datasets (as Graph-Views), and providing the ability to write arbitrary programs against the graphs.
Abstract: Graph querying and analytics are becoming an increasingly important component of the arsenal of tools for extracting different kinds of insights from data. Despite an immense amount of work on those topics, graphs are largely still handled in an ad hoc manner, in part because most data continues to reside in relational-like data management systems, and because graph analytics/querying typically forms a small portion of the overall analysis pipelines. In this paper we describe an end-to-end graph analysis framework, called GraphGen, that sits atop an RDBMS, and supports graph querying/analytics through: (a) defining graphs as transformations over underlying relational datasets (as Graph-Views) and (b) specifying queries or analytics on those graphs using either a high-level language or Java programs against a simple graph API. Although conceptually simple, GraphGen acts as an abstraction/independence layer that opens up many opportunities for adaptively optimizing graph analysis workflows, since the system can decide where to execute tasks on a per-task basis (in database or outside), how much of the graph to materialize in memory, and what types of in-memory representations to use (especially critical when the graphs are larger than the input datasets, as is often the case). At the same time, by providing the ability to write arbitrary programs against the graphs, GraphGen removes a major expressivity limitation of many existing graph analysis systems, which only support limited programming frameworks. We describe the GraphGen DSL, loosely based on Datalog, that includes both graph specification and in-line analysis capabilities. We then discuss many optimization challenges in building GraphGen, that we are currently working on addressing.
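
A tiny sketch of the "graph as a view over relations" idea, using only Python's built-in sqlite3 rather than GraphGen's Datalog-like DSL: a hypothetical co-authorship graph is extracted from an author-publication table with a self-join.

```python
# Hypothetical sketch of a graph defined as a transformation over relational data;
# GraphGen expresses this with a Datalog-like DSL, here we use plain SQL + Python.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author_pub (author TEXT, pub_id INTEGER);
INSERT INTO author_pub VALUES
  ('alice', 1), ('bob', 1), ('carol', 1),
  ('alice', 2), ('dave', 2);
""")

# Graph-view: nodes are authors, an edge links authors sharing a publication.
edges = conn.execute("""
    SELECT DISTINCT a.author, b.author
    FROM author_pub a JOIN author_pub b
      ON a.pub_id = b.pub_id AND a.author < b.author
""").fetchall()

graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

print(graph)   # e.g. alice is connected to bob, carol and dave
```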

Proceedings ArticleDOI
09 May 2017
TL;DR: This tutorial reviews and summarizes the research thus far into HCI and graph querying in the database community, giving researchers a snapshot of the current state of the art in this topic, and future research directions.
Abstract: Querying graph databases has emerged as an important research problem for real-world applications that center on large graph data. Given the syntactic complexity of graph query languages (e.g., SPARQL, Cypher), visual graph query interfaces make it easy for non-expert users to query such graph data repositories. In this tutorial, we survey recent developments in the emerging area of visual graph querying paradigm that bridges traditional graph querying with human computer interaction (HCI). We discuss manual and data-driven visual graph query interfaces, various strategies and guidance for constructing graph queries visually, interleaving processing of graph queries and visual actions, and visual exploration of graph query results. In addition, the tutorial suggests open problems and new research directions. In summary, in this tutorial we review and summarize the research thus far into HCI and graph querying in the database community, giving researchers a snapshot of the current state of the art in this topic, and future research directions.

Proceedings Article
01 Jan 2017
TL;DR: It is shown that the relational database system outperforms Neo4j for analytical queries and that Neo4J is faster for queries that do not filter on specific edge types.
Abstract: Graph databases with a custom non-relational backend promote themselves to outperform relational databases in answering queries on large graphs. Recent empirical studies show that this claim is not always true. However, these studies focus only on pattern matching queries and neglect analytical queries used in practice such as shortest path, diameter, degree centrality or closeness centrality. In addition, there is no distinction between different types of pattern matching queries. In this paper, we introduce a set of analytical and pattern matching queries, and evaluate them in Neo4j and a market-leading commercial relational database system. We show that the relational database system outperforms Neo4j for our analytical queries and that Neo4j is faster for queries that do not filter on specific edge types.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed improved policy overcomes the drawbacks of the conventional maximal number of forbidding First Bad Marking (FBM) technique and can be used in all kinds of nets.
Abstract: Flexible manufacturing systems exhibit a high degree of resource sharing. Since the parts advancing through the system compete for a finite number of resources, a deadlock may occur. Accordingly, m

Proceedings ArticleDOI
01 Jun 2017
TL;DR: This work proposes to bridge bidirectional value driven design between economic planning and technology implementation on the basis of the data graph, information graph and knowledge graph to improve system reliability and robustness by managing data and information reuse, redundancy as well as structure.
Abstract: Value-Driven Design enables rational decisions to be made in terms of the optimum business and technical solution at every level of engineering design by employing economics in decision making. In order to maximize the business profitability, we propose to bridge bidirectional value driven design between economic planning and technology implementation on the basis of the data graph, information graph and knowledge graph. We use data graph, information graph and knowledge graph to analyze problems that have negative impact on activities of software development including requirement analysis, summary design and detail design. We propose to improve system reliability and robustness by managing data and information reuse, redundancy as well as structure.

Proceedings ArticleDOI
28 Aug 2017
TL;DR: This paper shows how the proposed multidimensional (MD) data model for graph analysis was implemented over the widely used Neo4J graph database, discusses implementation issues, and presents a detailed case study to show how OLAP operations can be used on graphs.
Abstract: In current Big Data scenarios, traditional data warehousing and Online Analytical Processing (OLAP) operations on cubes are clearly not sufficient to address the current data analysis requirements. Nevertheless, OLAP operations and models can expand the possibilities of graph analysis beyond the traditional graph-based computation. In spite of this, there is not much work on the problem of taking OLAP analysis to the graph data model. In previous work we proposed a multidimensional (MD) data model for graph analysis, that considers not only the basic graph data, but background information in the form of dimension hierarchies as well. The graphs in our model are node- and edge-labelled directed multi-hypergraphs, called graphoids, defined at several different levels of granularity. In this paper we show how we implemented this proposal over the widely used Neo4J graph database, discuss implementation issues, and present a detailed case study to show how OLAP operations can be used on graphs.

Book ChapterDOI
24 Sep 2017
TL;DR: In this article, the authors observe that graph database systems are increasingly adapted for storing and processing heterogeneous network-like datasets, but that, due to the novelty of such systems, no standard data model or query language has yet emerged, subjecting users to the possibility of vendor lock-in.
Abstract: Graph database systems are increasingly adapted for storing and processing heterogeneous network-like datasets. However, due to the novelty of such systems, no standard data model or query language has yet emerged. Consequently, migrating datasets or applications even between related technologies often requires a large amount of manual work or ad-hoc solutions, thus subjecting the users to the possibility of vendor lock-in. To avoid this threat, vendors are working on supporting existing standard languages (e.g. SQL) or standardising languages.

Journal ArticleDOI
TL;DR: A new method for storing, modeling, and analyzing power grid data using the open source graph database Neo4j and the efficiency and effectiveness of topology modeling and analysis using graph database for a power grid network are introduced.
Abstract: We introduce a new method for storing, modeling, and analyzing power grid data. First, we present an architecture for building the network model for a power grid using the open source graph database Neo4j. Second, we design single- and multi-threading systems for initial energization analysis of the power grid network. We design the shortest path search function and conditional search function based on Neo4j. Finally, we compare the functionality and efficiency of our graph database with a traditional relational database in system initial energization analysis and the shortest path function problems on small to large data sets. The results demonstrate the efficiency and effectiveness of topology modeling and analysis using graph database for a power grid network.
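
The shortest-path search mentioned here is built on Neo4j in the paper; purely as an illustration of the underlying computation, the sketch below runs Dijkstra's algorithm over a small made-up grid topology kept in plain Python structures.

```python
# Illustrative Dijkstra shortest path over a toy power-grid topology (hypothetical
# buses and line weights); the paper implements the equivalent search on top of Neo4j.
import heapq

lines = {  # bus -> [(neighbour bus, line weight)]
    "bus1": [("bus2", 1.0), ("bus3", 4.0)],
    "bus2": [("bus1", 1.0), ("bus3", 1.5), ("bus4", 3.0)],
    "bus3": [("bus1", 4.0), ("bus2", 1.5), ("bus4", 1.0)],
    "bus4": [("bus2", 3.0), ("bus3", 1.0)],
}

def shortest_path(graph, source, target):
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in graph[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    path, node = [], target
    while node != source:
        path.append(node)
        node = prev[node]
    return [source] + path[::-1], dist[target]

print(shortest_path(lines, "bus1", "bus4"))   # (['bus1', 'bus2', 'bus3', 'bus4'], 3.5)
```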

Proceedings ArticleDOI
01 Sep 2017
TL;DR: In this paper, the authors present an approach and associated software for analyzing the performance and scalability of parallel, open-source graph libraries, such as GraphMat, Graph500, Graph Algorithm Platform Benchmark Suite, GraphBIG, and PowerGraph.
Abstract: The rapidly growing number of large network analysis problems has led to the emergence of many parallel and distributed graph processing systems—one survey in 2014 identified over 80. Determining the best approach for a given problem is infeasible for most developers. We present an approach and associated software for analyzing the performance and scalability of parallel, open-source graph libraries. We demonstrate our approach on five graph processing packages: GraphMat, Graph500, Graph Algorithm Platform Benchmark Suite, GraphBIG, and PowerGraph, using synthetic and real-world datasets. We examine previously overlooked aspects of parallel graph processing performance, such as phases of execution and energy usage, for three algorithms: breadth first search, single source shortest paths, and PageRank.

Posted Content
TL;DR: GraphMP as mentioned in this paper proposes a vertex-centric sliding window (VSW) computation model to avoid reading and writing vertices on disk and a selective scheduling method to skip loading and processing unnecessary edge shards on disk.
Abstract: Recent studies showed that single-machine graph processing systems can be as highly competitive as cluster-based approaches on large-scale problems. While several out-of-core graph processing systems and computation models have been proposed, the high disk I/O overhead could significantly reduce performance in many practical cases. In this paper, we propose GraphMP to tackle big graph analytics on a single machine. GraphMP achieves low disk I/O overhead with three techniques. First, we design a vertex-centric sliding window (VSW) computation model to avoid reading and writing vertices on disk. Second, we propose a selective scheduling method to skip loading and processing unnecessary edge shards on disk. Third, we use a compressed edge cache mechanism to fully utilize the available memory of a machine to reduce the amount of disk accesses for edges. Extensive evaluations have shown that GraphMP could outperform state-of-the-art systems such as GraphChi, X-Stream and GridGraph by 31.6x, 54.5x and 23.1x respectively, when running popular graph applications on a billion-vertex graph.

Proceedings ArticleDOI
01 Sep 2017
TL;DR: DH-Falcon is presented, a graph DSL (domain-specific language) which can be used to implement parallel algorithms for large-scale graphs, targeting Distributed Heterogeneous (CPU and GPU) clusters, and gains a speedup of up to 13×.
Abstract: Graph models of social information systems typically contain trillions of edges. Such big graphs cannot be processed on a single machine. The graph object must be partitioned and distributed among machines and processed in parallel on a computer cluster. Programming such systems is very challenging. In this work, we present DH-Falcon, a graph DSL (domain-specific language) which can be used to implement parallel algorithms for large-scale graphs, targeting Distributed Heterogeneous (CPU and GPU) clusters. The DH-Falcon compiler is built on top of the Falcon compiler, which targets single node devices with CPU and multiple GPUs. An important facility provided by DH-Falcon is that it supports mutation of graph objects, which allows programmers to write dynamic graph algorithms. Experimental evaluation shows that DH-Falcon matches or outperforms state-of-the-art frameworks and gains a speedup of up to 13×.

Patent
06 Jan 2017
TL;DR: In this article, an un-optimized computational graph is analyzed using pattern matching to determine fusible operations that can be fused together into a single fusion operation, and the fusion node is translated as a call that performs the fused operations.
Abstract: Method comprising obtaining 202 an un-optimised computational graph comprising a plurality of nodes representing operations and directed edges representing data dependencies; analysing 204 the un-optimised graph using pattern matching to determine fusible operations that can be fused together into a single fusion operation; transforming 206 the un-optimised graph into an optimised computational graph by replacing the nodes representing the fusible operations in the un-optimised graph with a fusion node representing the single fusion operation; and providing to a compiler the fusion node that the compiler can translate as a call that performs the fused operations to produce 208 efficient code. It may also provide the efficient code to computing devices for execution. Execution may include executing the operations of the computational graph including the single fusion call that performs all fused operations. Analysing using pattern matching may involve comparing portions of the un-optimised graph with patterns of operations that each correspond to a single fusion operation; determining that a pattern matches a portion of the un-optimised graph; and determining that the matching portion of the un-optimised graph can be replaced in the computational graph with the single fusion operation corresponding to the matching pattern.
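
A toy sketch of the kind of pattern-matching fusion the claim describes (the node names and the fused multiply-add pattern are ours, not taken from the patent): a mul node that feeds only one add node is collapsed into a single fused node.

```python
# Toy illustration of fusing operations in a computational graph by pattern matching;
# the node names and the 'fma' fusion pattern are hypothetical, not from the patent.
graph = {                       # node id -> (op, list of input node ids)
    "a": ("input", []),
    "b": ("input", []),
    "c": ("input", []),
    "t": ("mul", ["a", "b"]),
    "out": ("add", ["t", "c"]),
}

def fuse_mul_add(graph):
    fused = dict(graph)
    for node, (op, inputs) in graph.items():
        if op != "add":
            continue
        for inp in inputs:
            in_op, in_inputs = graph[inp]
            uses = sum(inp in ins for _, ins in graph.values())
            # Pattern: add(mul(x, y), z) where the mul feeds nothing else -> fma(x, y, z)
            if in_op == "mul" and uses == 1:
                other = [i for i in inputs if i != inp]
                fused[node] = ("fma", in_inputs + other)
                del fused[inp]
    return fused

print(fuse_mul_add(graph))
# {'a': ..., 'out': ('fma', ['a', 'b', 'c'])} -- the mul node 't' has been fused away
```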

Journal ArticleDOI
TL;DR: This paper presents a large-graph visualization system called ModuleGraph, a scalable representation of graph structures by treating a graph as a set of modules that can efficiently support large-scale social and spatial network visualization.
Abstract: The efficient visualization of dynamic network structures has become a dominant problem in many big data applications, such as large network analytics, traffic management, resource allocation graphs, logistics, social networks, and large document repositories. In this paper, we present a large-graph visualization system called ModuleGraph. ModuleGraph is a scalable representation of graph structures by treating a graph as a set of modules. The main objectives are: (1) to detect graph patterns in the visualization of large-graph data, and (2) to emphasize the interconnecting structures to detect potential interactions between local modules. Our first contribution is a hybrid modularity measure. This measure partitions the cohesion of the graph at various levels of details. We aggregate clusters of nodes and edges into several modules to reduce the overlap between graph components on a 2D display. Our second contribution is a k-clustering method that can flexibly detect the local patterns or substructures in modules. Patterns of modules are preserved by the ModuleGraph system to avoid information loss, while sub-graphs are clustered as a single node. Our experiments show that this method can efficiently support large-scale social and spatial network visualization.

Book ChapterDOI
12 Jun 2017
TL;DR: A multi-core scheduling approach for a model presenting Mixed-criticality tasks and their dependencies as a Directed Acyclic Graph (DAG) is proposed and an evaluation framework is introduced, released as an open source software.
Abstract: Deploying safety-critical systems into constrained embedded platforms is a challenge for developers, who must arbitrate between two conflicting objectives: software has to be safe and resources need to be used efficiently. Mixed-criticality (MC) has been proposed to meet a trade-off between these two aspects. Nonetheless, most task models considered in the MC scheduling literature do not take into account precedence constraints among tasks. In this paper, we propose a multi-core scheduling approach for a model presenting MC tasks and their dependencies as a Directed Acyclic Graph (DAG). We also introduce an evaluation framework for this model, released as open source software. Evaluation of our scheduling algorithm provides evidence of the difficulty of finding correct schedules for DAGs of MC tasks. Moreover, the experimental results provided in this paper show that our scheduling algorithm outperforms existing algorithms for scheduling DAGs of MC tasks.
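
For readers unfamiliar with the task model, the sketch below performs a plain list-scheduling pass of a precedence-constrained task DAG onto two cores; the task set is invented and the sketch ignores the mixed-criticality aspects that the paper's algorithm handles.

```python
# Minimal list scheduling of a task DAG on 2 cores (illustration of the task model
# only; the paper's algorithm additionally handles mixed-criticality modes).
tasks = {"A": 2, "B": 3, "C": 2, "D": 1}                      # task -> execution time
deps = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}     # task -> predecessors

cores = [0, 0]        # time at which each core becomes free
finish = {}           # task -> finish time
ready = [t for t in tasks if not deps[t]]
scheduled = []

while ready:
    task = ready.pop(0)
    earliest = max((finish[p] for p in deps[task]), default=0)
    core = min(range(len(cores)), key=lambda c: cores[c])     # earliest-free core
    start = max(cores[core], earliest)
    finish[task] = start + tasks[task]
    cores[core] = finish[task]
    scheduled.append((task, core, start, finish[task]))
    ready += [t for t in tasks
              if t not in finish and t not in ready
              and all(p in finish for p in deps[t])]

print(scheduled)   # e.g. A first, then B and C in parallel on both cores, then D
```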

Journal ArticleDOI
23 May 2017
TL;DR: This paper presents optimization techniques used in relational database systems and applies them to graph queries; it evaluates various query plans on multiple datasets and discusses the effect of different optimization techniques.
Abstract: The last decade brought considerable improvements in distributed storage and query technologies, known as NoSQL systems. These systems provide quick evaluation of simple retrieval operations and are able to answer certain complex queries in a scalable way, albeit not instantly. Providing scalability and quick response times at the same time for querying large data sets is still a challenging task. Evaluating complex graph queries is particularly difficult, as it requires lots of join, antijoin and filtering operations. This paper presents optimization techniques used in relational database systems and applies them to graph queries. We evaluate various query plans on multiple datasets and discuss the effect of different optimization techniques.

Proceedings ArticleDOI
09 May 2017
TL;DR: This tutorial will provide a generalized definition of graph exploration in which the user interacts directly with the system either providing feedback or a partial query, and discuss common, diverse, and missing properties ofgraph exploration techniques based on this definition, the authors' taxonomy, and multiple applications for graph exploration.
Abstract: The increasing interest in social networks, knowledge graphs, protein-interaction, and many other types of networks has raised the question how users can explore such large and complex graph structures easily. Current tools focus on graph management, graph mining, or graph visualization but lack user-driven methods for graph exploration. In many cases graph methods try to scale to the size and complexity of a real network. However, methods miss user requirements such as exploratory graph query processing, intuitive graph explanation, and interactivity in graph exploration. While there is consensus in database and data mining communities on the definition of data exploration practices for relational and semi-structured data, graph exploration practices are still indeterminate. In this tutorial, we will discuss a set of techniques, which have been developed in the last few years for independent purposes, within a unified graph exploration taxonomy. The tutorial will provide a generalized definition of graph exploration in which the user interacts directly with the system either providing feedback or a partial query. We will discuss common, diverse, and missing properties of graph exploration techniques based on this definition, our taxonomy, and multiple applications for graph exploration. Concluding this discussion we will highlight interesting and relevant challenges for data scientists in graph exploration.

Journal ArticleDOI
TL;DR: This research illustrates the potential of graph structures for diversified modeling along with their convenience for various domains, and provides guidelines for solving data modeling challenges for structured, semi-structured and unstructured data.
Abstract: Graphs are considered the next frontier in the era of Big Data due to their flexibility and self-explaining property. The prime objective of this research is to present graph databases as an alternative to traditional relational databases. This research illustrates the potential of graph structures for diversified modeling along with their convenience for various domains. An extensive literature review demonstrates the use of diversified graph structures as a means of data storage and analysis, as they can cope with any kind of complex structure, ranging from multi-linked web data, complex chemical structures, gene data, network structures, social networks and e-commerce to text data. A formal conclusion of this review reveals the use of various graph models according to the state of affairs of various domains, as well as the data modeling challenges and complexity of Big Data. This investigation recommends the use of an appropriate graph structure and provides guidelines for solving data modeling challenges for structured, semi-structured and unstructured data. Diversified graph structures, along with their characteristics, have been suggested for real-world problems in various domains. This research could lend a helping hand to anyone who wants to implement a graph data model for their data management challenges and computational problems.

Proceedings ArticleDOI
08 Aug 2017
TL;DR: This work reports on efforts to develop verification methods for OSEK-conformant compilers, specifically of a code generator that weaves system calls and application code using a static configuration file, producing a stand-alone application that incorporates the relevant parts of the kernel.
Abstract: The OSEK industrial standard governs the design of embedded real-time operating systems in the automotive domain. We report on efforts to develop verification methods for OSEK-conformant compilers, specifically of a code generator that weaves system calls and application code using a static configuration file, producing a stand-alone application that incorporates the relevant parts of the kernel. Our methodology involves two verification steps: On the one hand, we extract an OS-application interaction graph during the compilation phase and verify that it conforms to the standard, in particular regarding prioritized scheduling and interrupt handling. To this end, we generate from the configuration file a temporal specification of standard-conformant behaviour and model check the arising formulas on a labelled transition system extracted from the interaction graph. On the other hand, we verify that the actual generated code conforms to the interaction graph; this is done by graph isomorphism checking of the interaction graph against a dynamically-explored state-transition graph of the generated system.

Proceedings ArticleDOI
01 Aug 2017
TL;DR: GeaBase is a new distributed graph database that provides the capability to store and analyze graph-structured data in real-time at massive scale, including a novel update architecture, called Update Center (UC), and a new language that is suitable for both graph traversal and analytics.
Abstract: Graph analytics has been gaining traction rapidly in the past few years. It has a wide array of application areas in the industry, ranging from e-commerce, social networks and recommendation systems to fraud detection and virtually any problem that requires insights into data connections, not just data itself. In this paper, we present GeaBase, a new distributed graph database that provides the capability to store and analyze graph-structured data in real-time at massive scale. We describe the details of the system and the implementation, including a novel update architecture, called Update Center (UC), and a new language that is suitable for both graph traversal and analytics. We also compare the performance of GeaBase to a widely used open-source graph database, Titan. Experiments show that GeaBase is up to 182x faster than Titan in our testing scenarios. GeaBase also achieves 22x higher throughput on social network workloads in the comparison.

Journal ArticleDOI
TL;DR: This paper provides the definitions of the summaries together with methods to extract them automatically, both from NoSQL graph databases alone and with the help of in-memory architectures, and demonstrates the benefit of the proposition with experimental results.
Abstract: NoSQL graph databases have been introduced in recent years for dealing with large collections of graph-based data. Scientific data and social networks are among the best examples of the dramatic increase of the use of such structures. NoSQL repositories allow the management of large amounts of data in order to store and query them. Such data are not structured with a predefined schema as relational databases would be. They are rather composed of nodes and relationships of a certain type. For instance, a node can represent a Person and a relationship Friendship. Retrieving the structure of the graph database is thus of great help to users, for example when they must know how to query the data or identify relevant data sources for recommender systems. For this reason, this paper introduces methods to retrieve structural summaries. Such structural summaries are extracted at different levels of information from the NoSQL graph database. The expression of the mining queries is facilitated by the use of two frameworks: Fuzzy4S, which allows defining fuzzy operators and operations with Scala, and Cypherf, which allows the use of fuzzy operators and operations in declarative queries over NoSQL graph databases. We show that extracting such summaries can be impossible with the NoSQL query engines because of the data volume and the complexity of the task of automatic knowledge extraction. A novel method based on in-memory architectures is thus introduced. This paper provides the definitions of the summaries with the methods to automatically extract them, both from NoSQL graph databases alone and with the help of in-memory architectures. The benefit of our proposition is demonstrated by experimental results.