Showing papers on "Wait-for graph" published in 2015

Open Access

Journal Article•DOI•

Thinking Like a Vertex: A Survey of Vertex-Centric Frameworks for Large-Scale Distributed Graph Processing

Robert Ryan McCune¹, Tim Weninger¹, Greg Madey¹•Institutions (1)

12 Oct 2015-ACM Computing Surveys

TL;DR: In this survey, the vertex-centric approach to graph processing is overviewed, TLAV frameworks are deconstructed into four main components and respectively analyzed, and TLAV implementations are reviewed and categorized.

Abstract: The vertex-centric programming model is an established computational paradigm recently incorporated into distributed processing frameworks to address challenges in large-scale graph processing. Billion-node graphs that exceed the memory capacity of commodity machines are not well supported by popular Big Data tools like MapReduce, which are notoriously poor performing for iterative graph algorithms such as PageRank. In response, a new type of framework challenges one to “think like a vertex” (TLAV) and implements user-defined programs from the perspective of a vertex rather than a graph. Such an approach improves locality, demonstrates linear scalability, and provides a natural way to express and compute many iterative graph algorithms. These frameworks are simple to program and widely applicable but, like an operating system, are composed of several intricate, interdependent components, of which a thorough understanding is necessary in order to elicit top performance at scale. To this end, the first comprehensive survey of TLAV frameworks is presented. In this survey, the vertex-centric approach to graph processing is overviewed, TLAV frameworks are deconstructed into four main components and respectively analyzed, and TLAV implementations are reviewed and categorized.
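To make the "think like a vertex" idea concrete, the following is a minimal single-machine sketch of synchronous, superstep-based PageRank. It is not taken from the survey or from any particular TLAV framework: the function name `pagerank_tlav`, the dictionary-based graph format, and the handling of dangling vertices are all assumptions made for illustration.

```python
from collections import defaultdict


def pagerank_tlav(out_edges, supersteps=20, damping=0.85):
    """Synchronous vertex-centric PageRank (illustrative sketch only).

    out_edges maps each vertex to a list of its out-neighbours.
    """
    # Collect every vertex, including those that only appear as targets.
    vertices = set(out_edges)
    for targets in out_edges.values():
        vertices.update(targets)
    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}

    for _ in range(supersteps):
        inbox = defaultdict(float)  # messages delivered at the superstep barrier
        dangling = 0.0              # rank held by vertices with no out-edges

        # Send phase: each vertex only looks at its own state and out-edges.
        for v in vertices:
            targets = out_edges.get(v, [])
            if targets:
                share = rank[v] / len(targets)
                for t in targets:
                    inbox[t] += share
            else:
                dangling += rank[v]

        # Compute phase: each vertex updates itself from its received messages.
        rank = {
            v: (1.0 - damping) / n + damping * (inbox[v] + dangling / n)
            for v in vertices
        }
    return rank
```

Each vertex reads only its own rank and its incoming messages, which is the property that lets real frameworks partition vertices across machines; this sketch simply runs all vertices in one loop.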

267 citations

Book•

Graph Databases: New Opportunities for Connected Data

Ian Scott Robinson, Jim Webber, Emil Eifrem

10 Jun 2015

TL;DR: This second edition of this practical book includes new code samples and diagrams, using the latest Neo4j syntax, as well as information on new functionality.

Abstract: Discover how graph databases can help you manage and query highly connected data. With this practical book, you'll learn how to design and implement a graph database that brings the power of graphs to bear on a broad range of problem domains. Whether you want to speed up your response to user queries or build a database that can adapt as your business evolves, this book shows you how to apply the schema-free graph model to real-world problems. This second edition includes new code samples and diagrams, using the latest Neo4j syntax, as well as information on new functionality. Learn how different organizations are using graph databases to outperform their competitors. With this book's data modeling, query, and code examples, you'll quickly be able to implement your own solution. Model data with the Cypher query language and property graph model. Learn best practices and common pitfalls when modeling with graphs. Plan and implement a graph database solution in test-driven fashion. Explore real-world examples to learn how and why organizations use a graph database. Understand common patterns and components of graph database architecture. Use analytical techniques and algorithms to mine graph database information.

184 citations

Proceedings Article•DOI•

Locality Exists in Graph Processing: Workload Characterization on an Ivy Bridge Server

Scott Beamer¹, Krste Asanovic¹, David A. Patterson¹•Institutions (1)

University of California, Berkeley¹

04 Oct 2015

TL;DR: Analyzing three high-performance graph algorithm codebases with hardware performance counters on a conventional dual-socket server, this work finds substantial room for a different processor architecture to improve performance without requiring a new memory system.

Abstract: Graph processing is an increasingly important application domain and is typically communication-bound. In this work, we analyze the performance characteristics of three high-performance graph algorithm codebases using hardware performance counters on a conventional dual-socket server. Unlike many other communication-bound workloads, graph algorithms struggle to fully utilize the platform's memory bandwidth and so increasing memory bandwidth utilization could be just as effective as decreasing communication. Based on our observations of simultaneous low compute and bandwidth utilization, we find there is substantial room for a different processor architecture to improve performance without requiring a new memory system.

164 citations

Proceedings Article•DOI•

GraphBIG: understanding graph computing in the context of industrial solutions

Lifeng Nai¹, Yinglong Xia², Ilie Gabriel Tanase², Hyesoon Kim¹, Ching-Yung Lin¹•Institutions (2)

Georgia Institute of Technology¹, IBM²

15 Nov 2015

TL;DR: This paper characterizes GraphBIG on real machines and observes extremely irregular memory patterns and significantly diverse behavior across different computations, helping users understand the impact of modern graph computing on the hardware architecture and enabling future architecture and system research.

Abstract: With the emergence of data science, graph computing is becoming a crucial tool for processing big connected data. Although efficient implementations of specific graph applications exist, the behavior of full-spectrum graph computing remains unknown. To understand graph computing, we must consider multiple graph computation types, graph frameworks, data representations, and various data sources in a holistic way. In this paper, we present GraphBIG, a benchmark suite inspired by the IBM System G project. To cover major graph computation types and data sources, GraphBIG selects representative data structures, workloads and data sets from 21 real-world use cases of multiple application domains. We characterized GraphBIG on real machines and observed extremely irregular memory patterns and significantly diverse behavior across different computations. GraphBIG helps users understand the impact of modern graph computing on the hardware architecture and enables future architecture and system research.

141 citations

Proceedings Article•DOI•

PGX.D: a fast distributed graph processing engine

Sungpack Hong¹, Siegfried Depner¹, Thomas Manhardt¹, Jan Van Der Lugt², Merijn Verstraaten³, Hassan Chafi¹•Institutions (3)

Oracle Corporation¹, Google², University of Amsterdam³

15 Nov 2015

TL;DR: This paper presents a fast distributed graph processing system, namely PGX.D, as a low-overhead, bandwidth-efficient communication framework that supports remote data-pulling patterns and recommends the use of balanced beefy clusters where the sustained random DRAM-access bandwidth in aggregate is matched with the bandwidth of the underlying interconnection fabric.

Abstract: Graph analysis is a powerful method in data analysis. Although several frameworks have been proposed for processing large graph instances in distributed environments, their performance is much lower than using efficient single-machine implementations provided with enough memory. In this paper, we present a fast distributed graph processing system, namely PGX.D. We show that PGX.D outperforms other distributed graph systems like GraphLab significantly (3x -- 90x). Furthermore, PGX.D on 4 to 16 machines is also faster than an implementation optimized for single-machine execution. Using a fast cooperative context-switching mechanism, we implement PGX.D as a low-overhead, bandwidth-efficient communication framework that supports remote data-pulling patterns. Moreover, PGX.D achieves large traffic reduction and good workload balance by applying selective ghost nodes, edge partitioning, and edge chunking transparently to the user. Our analysis confirms that each of these features is indeed crucial for overall performance of certain kinds of graph algorithms. Finally, we advocate the use of balanced beefy clusters where the sustained random DRAM-access bandwidth in aggregate is matched with the bandwidth of the underlying interconnection fabric.

87 citations

Proceedings Article•DOI•

A task-based linear algebra Building Blocks approach for scalable graph analytics

Michael M. Wolf¹, Jonathan W. Berry¹, Dylan Stark¹•Institutions (1)

Sandia National Laboratories¹

12 Nov 2015

TL;DR: This work outlines a Graph BLAS-like, linear algebra based approach to miniTri, one such miniapp, and describes a task-based prototype implementation and gives initial scalability results.

Abstract: It is challenging to obtain scalable HPC performance on real applications, especially for data science applications with irregular memory access and computation patterns. To drive co-design efforts in architecture, system, and application design, we are developing miniapps representative of data science workloads. These in turn stress the state of the art in Graph BLAS-like Graph Algorithm Building Blocks (GABB). In this work, we outline a Graph BLAS-like, linear algebra based approach to miniTri, one such miniapp. We describe a task-based prototype implementation and give initial scalability results.

38 citations

Journal Article•DOI•

GraphGen: exploring interesting graphs in relational data

Konstantinos Xirogiannopoulos¹, Udayan Khurana¹, Amol Deshpande¹•Institutions (1)

University of Maryland, College Park¹

01 Aug 2015

TL;DR: This work demonstrates a system, called GraphGen, that enables users to declaratively specify graph extraction tasks over relational databases, visually explore the extracted graphs, and write and execute graph algorithms over them, either directly or using existing graph libraries like the widely used NetworkX Python library.

Abstract: Analyzing interconnection structures among the data through the use of graph algorithms and graph analytics has been shown to provide tremendous value in many application domains. However, graphs are not the primary choice for how most data is currently stored, and users who want to employ graph analytics are forced to extract data from their data stores, construct the requisite graphs, and then use a specialized engine to write and execute their graph analysis tasks. This cumbersome and costly process not only raises barriers in using graph analytics, but also makes it hard to explore and identify hidden or implicit graphs in the data. Here we demonstrate a system, called GraphGen, that enables users to declaratively specify graph extraction tasks over relational databases, visually explore the extracted graphs, and write and execute graph algorithms over them, either directly or using existing graph libraries like the widely used NetworkX Python library. We also demonstrate how unifying the extraction tasks and the graph algorithms enables significant optimizations that would not be possible otherwise.

38 citations

Proceedings Article•DOI•

Graph analytics using Vertica relational database

Alekh Jindal¹, Samuel Madden¹, Malu Castellanos², Meichun Hsu²•Institutions (2)

Massachusetts Institute of Technology¹, HP Software Division²

29 Oct 2015

TL;DR: It is shown that vertex-centric graph analysis can be translated to SQL queries, typically involving table scans and joins, and that modern column-oriented databases are very well suited to running such queries.

Abstract: Graph analytics is becoming increasingly popular, with a number of new applications and systems developed in the past few years. In this paper, we study the Vertica relational database as a platform for graph analytics. We show that vertex-centric graph analysis can be translated to SQL queries, typically involving table scans and joins, and that modern column-oriented databases are very well suited to running such queries. Furthermore, we show how developers can trade memory footprint for significantly reduced I/O costs in Vertica. We present an experimental evaluation of the Vertica relational database system on a variety of graph analytics, including iterative analysis, a combination of graph and relational analyses, and more complex 1-hop neighborhood graph analytics, showing that it is competitive with two popular vertex-centric graph analytics systems, namely Giraph and GraphLab.

33 citations

Proceedings Article•DOI•

GRAPHITE: an extensible graph traversal framework for relational database management systems

Marcus Paradies¹, Wolfgang Lehner¹, Christof Bornhövd•Institutions (1)

Dresden University of Technology¹

29 Jun 2015

TL;DR: In this article, an extensible graph traversal framework (GRAPHITE) is proposed as a central graph processing component on a common storage engine inside a relational database management system, and two traversal algorithm implementations are derived for different graph topologies and traversal queries.

Abstract: Graph traversals are a basic but fundamental ingredient for a variety of graph algorithms and graph-oriented queries. To achieve the best possible query performance, they need to be implemented at the core of a database management system that aims at storing, manipulating, and querying graph data. Increasingly, modern business applications demand native graph query and processing capabilities for enterprise-critical operations on data stored in relational database management systems. In this paper we propose an extensible graph traversal framework (GRAPHITE) as a central graph processing component on a common storage engine inside a relational database management system. We study the influence of the graph topology on the execution time of graph traversals and derive two traversal algorithm implementations specialized for different graph topologies and traversal queries. We conduct extensive experiments on GRAPHITE for a large variety of real-world graph data sets and input configurations. Our experiments show that the proposed traversal algorithms differ by up to two orders of magnitude for different input configurations and therefore demonstrate the need for a versatile framework to efficiently process graph traversals on a wide range of different graph topologies and types of queries. Finally, we highlight that the query performance of our traversal implementations is competitive with that of two native graph database management systems.

31 citations

Proceedings Article•DOI•

Wait-Free Concurrent Graph Objects with Dynamic Traversals.

Nikolaos D. Kallimanis¹, Eleni Kanellou²•Institutions (2)

University of Ioannina¹, University of Rennes²

01 Jan 2015

TL;DR: This paper introduces an innovative concurrent graph model that provides addition and removal of any edge of the graph, as well as atomic traversals of a part (or the entirety) of the graph, and presents Dense, a wait-free concurrent graph implementation that aims to mitigate the performance bottlenecks of global locking and partitioning.

Abstract: Graphs are versatile data structures that allow the implementation of a variety of applications, such as computer-aided design and manufacturing, video gaming, or scientific simulations. However, although data structures such as queues, stacks, and trees have been widely studied and implemented in the concurrent context, multi-process applications that rely on graphs still largely use a sequential implementation where accesses are synchronized through the use of global locks or partitioning, thus imposing serious performance bottlenecks. In this paper we introduce an innovative concurrent graph model that provides addition and removal of any edge of the graph, as well as atomic traversals of a part (or the entirety) of the graph. We further present Dense, a concurrent graph implementation that aims at mitigating the two aforementioned implementation drawbacks. Dense achieves wait-freedom by relying on helping and provides the inbuilt capability of performing a partial snapshot on a dynamically determined subset of the graph.

30 citations

Proceedings Article•DOI•

Graph databases: A survey

Rohit Kumar Kaliyar¹•Institutions (1)

Shiv Nadar University¹

15 May 2015

TL;DR: An overview of the different types of graph databases, their applications, and a comparison between their models based on some properties is given.

Abstract: In the era of big data, data analytics, and business intelligence, database management plays a vital role from the technical, business management, and research points of view. Database management has been a topic of active research for many decades. Different types of database management systems have been proposed over time, but the Relational Database Management System (RDBMS) has been the most widely used in academic research as well as in industry [1]. In recent years, graph databases have regained interest among researchers for several reasons. One of the most important is the inherent structure of graphs: graphs are present everywhere in data and represent the strong connectivity within it. In most graph database models, the data structures for schema and instances are modeled as a graph or a generalization of a graph, and data manipulations are expressed by graph-oriented operations and type constructors [9]. Nowadays, most real-world applications can be modeled as a graph; social and biological networks are among the best examples. This paper gives an overview of the different types of graph databases, their applications, and a comparison between their models based on some properties.

Journal Article•DOI•

Synchronization in Directed Complex Networks Using Graph Comparison Tools

Hui Liu¹, Ming Cao², Chai Wah Wu³, Jun-an Lu⁴, Chi K. Tse⁵•Institutions (5)

Huazhong University of Science and Technology¹, University of Groningen², IBM³, Wuhan University⁴, Hong Kong Polytechnic University⁵

25 Mar 2015-IEEE Transactions on Circuits and Systems

TL;DR: Lower bounds for the coupling strengths of oscillators in directed networks to guarantee global synchronization are proposed and examples are provided to demonstrate how to apply the proposed methodologies to typical directed complex networks.

Abstract: This paper proposes lower bounds for the coupling strengths of oscillators in directed networks to guarantee global synchronization. The novel idea of graph comparison from spectral graph theory is employed so that the combinatorial features of a given network can be fully utilized to simplify computations. For large networks that can be decomposed into a set of smaller strongly connected components, the comparison can be carried out at the local level as well. To validate theoretical analysis, examples are provided to demonstrate how to apply the proposed methodologies to typical directed complex networks.

Proceedings Article•DOI•

Performance Characterization of High-Level Programming Models for GPU Graph Analytics

Yuduo Wu¹, Yangzihao Wang¹, Yuechao Pan¹, Carl Yang¹, John D. Owens¹•Institutions (1)

University of California, Davis¹

04 Oct 2015

TL;DR: It is shown that efficient building block operators enable more powerful operations for fast information propagation and result in fewer device kernel invocations, less data movement, and fewer global synchronizations, and thus are key focus areas for efficient large-scale graph analytics on the GPU.

Abstract: We identify several factors that are critical to high-performance GPU graph analytics: efficient building block operators, synchronization and data movement, workload distribution and load balancing, and memory access patterns. We analyze the impact of these critical factors through three GPU graph analytic frameworks: Gunrock, MapGraph, and VertexAPI2. We also examine their effect on different workloads: four common graph primitives from multiple graph application domains, evaluated through real-world and synthetic graphs. We show that efficient building block operators enable more powerful operations for fast information propagation and result in fewer device kernel invocations, less data movement, and fewer global synchronizations, and thus are key focus areas for efficient large-scale graph analytics on the GPU.

Proceedings Article•DOI•

Dynamic deadlock verification for general barrier synchronisation

Tiago Cogumbreiro¹, Raymond Hu¹, Francisco Martins², Nobuko Yoshida¹•Institutions (2)

Imperial College London¹, University of Lisbon²

24 Jan 2015

TL;DR: A core language expressive enough to represent the three most widespread barrier synchronisation patterns (group, split-phase, and dynamic membership) is introduced, and a graph analysis technique that selects between two alternative graph representations is proposed; finding a deadlock in either representation is proved to be equivalent.

Abstract: We present Armus, a dynamic verification tool for deadlock detection and avoidance specialised in barrier synchronisation. Barriers are used to coordinate the execution of groups of tasks, and serve as a building block of parallel computing. Our tool verifies more barrier synchronisation patterns than the current state of the art. To improve the scalability of verification, we introduce a novel event-based representation of concurrency constraints, and a graph-based technique for deadlock analysis. The implementation is distributed and fault-tolerant, and can verify X10 and Java programs. To formalise the notion of barrier deadlock, we introduce a core language expressive enough to represent the three most widespread barrier synchronisation patterns: group, split-phase, and dynamic membership. We propose a graph analysis technique that selects from two alternative graph representations: the Wait-For Graph, which favours programs with more tasks than barriers; and the State Graph, optimised for programs with more barriers than tasks. We prove that finding a deadlock in either representation is equivalent, and that the verification algorithm is sound and complete with respect to the notion of deadlock in our core language. Armus is evaluated with three benchmark suites in local and distributed scenarios. The benchmarks show that graph analysis with automatic graph-representation selection can record a 7-fold execution increase versus the traditional fixed graph representation. The performance measurements for distributed deadlock detection between 64 processes show negligible overheads.
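As an illustration of the Wait-For Graph idea mentioned in this abstract, here is a simplified sketch of barrier deadlock detection by cycle search. It is not Armus's algorithm or data model: the input format (`blocked_on` mapping each blocked task to a barrier, `pending` mapping each barrier to the tasks that have not yet arrived) and both function names are assumptions chosen for the example.

```python
def build_wait_for_graph(blocked_on, pending):
    """Build a task-to-task wait-for graph (hypothetical input format).

    blocked_on: task -> barrier the task is currently blocked on
    pending:    barrier -> tasks that have not yet arrived at that barrier
    """
    # A blocked task waits for every task its barrier is still missing.
    return {
        task: set(pending.get(barrier, ()))
        for task, barrier in blocked_on.items()
    }


def find_cycle(graph):
    """Return the tasks on one cycle of the graph, or None if it is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}
    for root in graph:
        if colour.get(root, WHITE) != WHITE:
            continue
        colour[root] = GREY
        path = [root]
        stack = [(root, iter(graph.get(root, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                state = colour.get(nxt, WHITE)
                if state == GREY:
                    # Back edge to a task on the current path: a deadlock cycle.
                    return path[path.index(nxt):]
                if state == WHITE:
                    colour[nxt] = GREY
                    path.append(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                # All successors explored without finding a cycle through node.
                colour[node] = BLACK
                stack.pop()
                path.pop()
    return None
```

For example, if `t1` is blocked on a barrier that still awaits `t2`, and `t2` is blocked on a barrier that still awaits `t1`, `find_cycle` reports both tasks; if the missing task is not itself blocked, it returns `None`.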

Proceedings Article•DOI•

GRAPH/Z: A Key-Value Store Based Scalable Graph Processing System

Tonglin Li¹, Chaoqi Ma, Jiabao Li, Xiaobing Zhou¹, Ke Wang¹, Dongfang Zhao¹, Iman Sadooghi¹, Ioan Raicu¹•Institutions (1)

Illinois Institute of Technology¹

08 Sep 2015

TL;DR: This work describes the design and implementation of a new graph processing system based on Bulk Synchronous Parallel model, built on top of ZHT, a scalable distributed key-value store, which benefits the graph processing in terms of scalability, performance and persistency.

Abstract: The emerging applications in big data and social networks place rapidly increasing demands on graph processing. Graph query operations that involve a large number of vertices and edges can be tremendously slow on traditional databases. The state-of-the-art graph processing systems and databases usually adopt a master/slave architecture that potentially impairs their scalability. This work describes the design and implementation of a new graph processing system based on the Bulk Synchronous Parallel model. Our system is built on top of ZHT, a scalable distributed key-value store, which benefits the graph processing in terms of scalability, performance and persistency. The experimental results imply excellent scalability.

Proceedings Article•DOI•

Mining And-Or Graphs for Graph Matching and Object Discovery

Quanshi Zhang¹, Ying Nian Wu¹, Song-Chun Zhu¹•Institutions (1)

University of California, Los Angeles¹

07 Dec 2015

TL;DR: This method provides a general solution to the problem of mining hierarchical models from unannotated visual data without exhaustive search of objects and is applied to RGB/RGB-D images and videos to demonstrate its generality and the wide range of applicability.

Abstract: This paper reformulates the theory of graph mining on the technical basis of graph matching, and extends its scope of applications to computer vision. Given a set of attributed relational graphs (ARGs), we propose to use a hierarchical And-Or Graph (AoG) to model the pattern of maximal-size common subgraphs embedded in the ARGs, and we develop a general method to mine the AoG model from the unlabeled ARGs. This method provides a general solution to the problem of mining hierarchical models from unannotated visual data without exhaustive search of objects. We apply our method to RGB/RGB-D images and videos to demonstrate its generality and the wide range of applicability. The code will be available at https://sites.google.com/site/quanshizhang/mining-and-or-graphs.

Posted Content•

EmptyHeaded: Boolean Algebra Based Graph Processing

Christopher R. Aberger, Andres Nötzli, Kunle Olukotun, Christopher Ré

09 Mar 2015-arXiv: Databases

TL;DR: A graph pattern engine, EmptyHeaded, is presented that uses recent algorithmic advances in join processing to compile patterns into Boolean algebra operations that exploit SIMD parallelism and outperforms specialized graph engines by over an order of magnitude and relational systems by over two orders of magnitude.

Abstract: We present a graph pattern engine, EmptyHeaded, that uses recent algorithmic advances in join processing to compile patterns into Boolean algebra operations that exploit SIMD parallelism. The EmptyHeaded engine demonstrates that treating graph patterns as a general join processing problem can compete with and often outperform both specialized approaches and existing OLAP systems on graph queries. The core Boolean algebra operation performed in EmptyHeaded is set intersection. Extracting SIMD parallelism during set intersections on graph data is challenging because graph data can be skewed in several different ways. Our contributions are a demonstration of this new type of engine with Boolean algebra at its core, an exploration of set intersection representations, and algorithms for set intersections that are optimized for skew. We demonstrate that EmptyHeaded outperforms specialized graph engines by over an order of magnitude and relational systems by over two orders of magnitude. Our results suggest that this new style of engine is a promising new direction for future graph engines and accelerators.
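To show why set intersection can serve as the core operation for a graph pattern query, the sketch below counts triangles by intersecting neighbour sets. It uses plain Python sets and makes no attempt at SIMD, skew-aware layouts, or query compilation, so it should be read as an illustration of the general idea only and not as EmptyHeaded's implementation; the function name `count_triangles` and the edge-list input are assumptions.

```python
def count_triangles(edges):
    """Count triangles in an undirected graph given as (u, v) pairs."""
    # Orient each edge from the smaller to the larger endpoint so that
    # every triangle u < v < w is discovered exactly once.
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        lo, hi = (u, v) if u < v else (v, u)
        adj.setdefault(lo, set()).add(hi)
        adj.setdefault(hi, set())

    total = 0
    for u, nbrs in adj.items():
        for v in nbrs:
            # Common higher-numbered neighbours of u and v close a triangle.
            total += len(nbrs & adj[v])
    return total
```

The whole query reduces to repeated `nbrs & adj[v]` intersections, which is the operation an engine of this style would aim to make fast.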

Book Chapter•DOI•

Graph Modification Problems: A Modern Perspective

Fedor V. Fomin¹, Saket Saurabh¹, Neeldhara Misra²•Institutions (2)

University of Bergen¹, Indian Institute of Science²

03 Jul 2015

TL;DR: An overview of recent results and techniques in parameterized algorithms for graph modification problems is given.

Abstract: We give an overview of recent results and techniques in parameterized algorithms for graph modification problems.

Proceedings Article•DOI•

Table2Graph: A Scalable Graph Construction from Relational Tables Using Map-Reduce

Sangkeun Lee¹, Byung H. Park¹, Seung-Hwan Lim¹, Mallikarjun Shankar¹•Institutions (1)

Oak Ridge National Laboratory¹

30 Mar 2015

TL;DR: This paper introduces Table2Graph, the graph construction tool based on Map-Reduce framework over Hadoop that turns multiple disparate data sources into a single heterogeneous graph model so that matching between entities across different source data would be expedited by examining their linkages in the graph.

Abstract: Identifying correlations and relationships between entities within and across different data sets (or databases) is of great importance in many domains. The data warehouse-based integration, which has been most widely practiced, is found to be inadequate to achieve such a goal. Instead we explored an alternate solution that turns multiple disparate data sources into a single heterogeneous graph model so that matching between entities across different source data would be expedited by examining their linkages in the graph. We found, however, that while a graph-based model provides outstanding capabilities for this purpose, construction of one such model from relational source databases was time-consuming and primarily left to ad hoc proprietary scripts. This led us to develop a reconfigurable and reusable graph construction tool that is designed to work at scale. In this paper, we introduce Table2Graph, the graph construction tool based on the Map-Reduce framework over Hadoop. We also discuss results from applying Table2Graph to integrate disparate healthcare databases.

Patent•

Executable graph framework for the management of complex systems

Caryl Erin Johnson, Kay E. Aikin, David S. Phelps, Alexander Johnson, Nathaniel Rindlaub

09 Apr 2015

TL;DR: In this paper, the authors propose a homoiconic or executable graph framework for representing complex heterogeneous characteristics of processes, systems, and systems of systems that feature many to many interrelationships.

Abstract: The present invention addresses deficiencies of the art with respect to collaborative computer networks consisting of mixed data, control functions, analysis functions, and sensors in complex systems of systems. The method involves a database framework for representing complex heterogeneous characteristics of processes, systems, and systems of systems that feature many to many interrelationships. The homoiconic graph framework takes the form of an executable graph database, which is often faster for associative data sets, and maps more directly to object-oriented computer applications for large-scale operations. The invention provides a method to execute the graph database, in that it comprises nodes that are both data fragments and executable components. It is characterized as a homoiconic or executable graph framework to distinguish this unique feature from the concept of a graph database, which generally is a repository of connected data only.

Evaluation of graph databases performance through indexing techniques

Steve Ataky Tsham Mpinda, L.C. Ferreira, Marcela Xavier Ribeiro, Marilde Terezinha Prado Santos

01 Jan 2015

TL;DR: The aim of this paper is to evaluate, through indexing techniques, the performance of Neo4j and OrientDB, both graph databases technologies and to come up with strength and weaknesses os each technology as a candidate for a storage mechanism of a graph structure.

...read moreread less

Abstract: The aim of this paper is to evaluate , through indexing techniques, the performance of Neo4j and OrientDB, both graph databases technologies and to come up with strength and weaknesses os each technology as a candidate for a storage mechanism of a graph structure. An index is a data structure that makes the searching faster for a specific node in concern of graph databases. The referred data structure is habitually a B-tree, however, can be a hash table or some other logic structure as well. The pivotal point of having an index is to speed up search queries, primarily by reducing the number of nodes in a graph or table to be examined. Graphs and graph databases are more commonly associated with social networking or “graph search” style recommendations. Thus, these technologies remarkably are a core technology platform for some Internet giants like Hi5, Facebook, Google, Badoo, Twitter and LinkedIn. The key to understanding graph database systems, in the social networking context, is they give equal prominence to storing both the data (users, favorites) and the relationships between them (who liked what, who ‘follows’ whom, which post was liked the most, what is the shortest path to ‘reach’ who). By a suitable application case study, in case a Twitter social networking of almost 5,000 nodes imported in local servers (Neo4j and Orient-DB), one queried to retrieval the node with the searched data, first without index (full scan), and second with index, aiming at comparing the response time (statement query time) of the aforementioned graph databases and find out which of them has a better performance (the speed of data or information retrieval) and in which case. Thereof, the main results are presented in the section 6.

Proceedings Article•DOI•

Higher order graph centrality measures for Neo4j

[...]

Georgios Drakopoulos¹, Aikaterini Baroutiadi¹, Vasileios Megalooikonomou¹•Institutions (1)

University of Patras¹

06 Jul 2015

TL;DR: An implementation of eigenvector centrality, a prominent member of the broad class of spectral centrality, in Java and NetBeans designed for use with Neo4j, a major schemaless graph database, is outlined and the findings resulting from its application to a real world social graph are discussed.

Abstract: Graphs are currently the epicenter of intense research as they lay the theoretical groundwork in diverse fields ranging from combinatorial optimization to computational neuroscience. Vertex centrality plays a crucial role in graph mining as it ranks vertices according to their contribution to overall graph communication. Specifically, within the social network analysis context centrality identifies influential individuals, whereas in the bioinformatics field centrality locates dominant proteins in protein-to-protein interaction. In recent years graph databases, part of the rising NoSQL movement, have been added to the graph analysis toolset. An implementation of eigenvector centrality, a prominent member of the broad class of spectral centrality, in Java and NetBeans designed for use with Neo4j, a major schemaless graph database, is outlined and the findings resulting from its application to a real world social graph are discussed.
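
Eigenvector centrality scores a vertex by the scores of its neighbours, so it can be approximated with power iteration on the adjacency matrix. The sketch below is a generic Python illustration, not the authors' Java implementation for Neo4j. The adjacency-dictionary input, the function name, and the use of the shifted matrix A + I (which has the same eigenvectors as A and avoids oscillation on bipartite graphs) are assumptions made for this example.

```python
import math

def eigenvector_centrality(adj, iterations=100, tol=1e-9):
    """Approximate eigenvector centrality of an undirected graph.

    `adj` maps each vertex to an iterable of its neighbours and is
    assumed to be symmetric (u in adj[v] whenever v in adj[u]).
    """
    nodes = list(adj)
    if not nodes:
        return {}
    score = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        # One step of power iteration with (A + I): own score plus neighbours' scores.
        new = {v: score[v] + sum(score[u] for u in adj[v]) for v in nodes}
        # Normalise to unit Euclidean length so the values stay bounded.
        norm = math.sqrt(sum(s * s for s in new.values()))
        new = {v: s / norm for v, s in new.items()}
        if max(abs(new[v] - score[v]) for v in nodes) < tol:
            return new
        score = new
    return score

# Hypothetical star graph: "a" is connected to every other vertex.
star = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}
centrality = eigenvector_centrality(star)
```

On the star graph the hub "a" receives the highest score and the three leaves receive equal scores, which matches the intuition that centrality identifies the most influential vertex.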

Proceedings Article•DOI•

Frappé: Querying the Linux Kernel Dependency Graph

[...]

Nathan Hawes¹, Ben Barham¹, Cristina Cifuentes¹•Institutions (1)

Oracle Corporation¹

31 May 2015

TL;DR: The graph model used by Frappé is detailed and its key use cases are outlined using representative queries and their runtimes with the dependency graph data of the Unbreakable Enterprise Kernel.

Abstract: Frappé is a developer tool for querying and visualizing the dependencies of large C/C++ software systems on the order of tens of millions of lines of code in size. It supports developers with a range of code comprehension queries such as “Does function X or something it calls write to global variable Y?” and “How much code could be affected if I change this macro?” Results are overlaid on a visualization of the dependency graph data based on a cartographic map metaphor. In this paper, we give a brief overview of Frappé and describe our experiences implementing it on top of the Neo4j graph database. We detail the graph model used by Frappé and outline its key use cases using representative queries and their runtimes with the dependency graph data of the Unbreakable Enterprise Kernel. Finally, we discuss some of the open challenges in supporting source code queries across single and multiple versions of an evolving codebase with current property graph database technologies: performance, efficient storage, and the expressivity of the graph querying language given a graph model.
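
The first example query in the abstract, whether a function or anything it calls writes to a given global variable, amounts to a reachability search over call edges followed by a check of write edges. The Python sketch below illustrates that idea only. The `calls` and `writes` dictionaries, the function names, and the variable `g_flag` are hypothetical and do not reflect Frappé's actual graph model or its Neo4j queries.

```python
from collections import deque

# Hypothetical dependency data: caller -> callees, function -> globals it writes.
calls = {"main": ["init", "run"], "init": ["set_flag"], "run": [], "set_flag": []}
writes = {"set_flag": {"g_flag"}}

def writes_transitively(func, var, calls, writes):
    """Return True if `func`, or any function reachable through calls, writes `var`."""
    seen = {func}
    queue = deque([func])
    while queue:
        current = queue.popleft()
        if var in writes.get(current, ()):
            return True
        for callee in calls.get(current, ()):
            if callee not in seen:  # guard against recursion cycles
                seen.add(callee)
                queue.append(callee)
    return False

main_writes_flag = writes_transitively("main", "g_flag", calls, writes)
run_writes_flag = writes_transitively("run", "g_flag", calls, writes)
```

Here `main` reaches `set_flag` through `init`, so the first query is true, while `run` calls nothing and the second is false. In a graph database the same question would typically be expressed as a variable-length path pattern rather than an explicit breadth-first search.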

Journal Article•DOI•

Graph theory and model collection management: conceptual framework and runtime analysis of selected graph algorithms

[...]

Dominic Breuker, Patrick Delfmann, Hanns-Alexander Dietrich, Matthias Steinhorst

01 Feb 2015-Information Systems and E-business Management

TL;DR: A graph algorithm based model analysis framework that can be accessed by specialized model analysis techniques is introduced and it is proved that basic graph algorithms are feasible to support such a framework.

Abstract: Analysing conceptual models is a frequent task of business process management (BPM), for instance to support comparison or integration of business processes, to check business processes for compliance or weaknesses, or to tailor conceptual models for different audiences. Because many companies have recently started to maintain large model collections, and analysing such collections manually can be laborious, practitioners have articulated a demand for automatic model analysis support. Hence, BPM scholars have proposed a plethora of different model analysis techniques. As virtually any conceptual model can be interpreted as a mathematical graph, and model analysis techniques often include some kind of graph problem, in this paper we introduce a graph-algorithm-based model analysis framework that can be accessed by specialized model analysis techniques. To show that basic graph algorithms are feasible to support such a framework, we conduct a performance analysis of selected graph algorithms.

Proceedings Article•DOI•

Expression and efficient processing of fuzzy queries in a graph database context

[...]

Olivier Pivert, Grégory Smits, Virginie Thion

02 Aug 2015

TL;DR: This paper describes a fuzzy query language, called FUDGE, that can be used to express preference queries on fuzzy graph databases and is implemented in a system, called SUGAR, that is presented in this article.

Abstract: Graph databases have attracted considerable interest in recent years thanks to their large scope of potential applications (e.g. social networks, biomedical networks, data stemming from the web). As has already been proposed for relational databases, defining a language that allows flexible querying of graph databases may greatly improve the usability of data. This paper focuses on the notion of fuzzy graph database and describes a fuzzy query language that makes it possible to handle such databases, which may be fuzzy or not, in a flexible way. This language, called FUDGE, can be used to express preference queries on fuzzy graph databases. The preferences concern i) the content of the vertices of the graph and ii) the structure of the graph. The FUDGE language is implemented in a system, called SUGAR, that we present in this article. We also discuss implementation issues of the FUDGE language in SUGAR.

Patent•

Systems and methods for a query optimization engine

[...]

Patrick Nguyen, Theodore Vassilakis, Sreenivasa Viswanadha, David Kryze

23 Jul 2015

TL;DR: In this article, the authors present a disclosure of systems, methods, and non-transitory computer readable media configured to receive at least one database query to be executed by a distributed computing system.

Abstract: Various embodiments of the present disclosure can include systems, methods, and non-transitory computer readable media configured to receive at least one database query to be executed. At least one computation graph corresponding to the at least one database query is generated. The computation graph is transformed to an optimized computation graph. The respective portions of the optimized computation graph are distributed to a plurality of distributed computing systems for execution. A result for the at least one database query is provided.

...read moreread less

Patent•

Graph Query Engine for Use within a Cognitive Environment

[...]

Matthew Sanchez, Dilum Ranatunga

24 Feb 2015

TL;DR: An apparatus for use within a cognitive information processing system environment is described, comprising a graph query engine that is coupled to receive data from a plurality of data sources and that receives and processes queries and bridges the queries into a cognitive graph.

Abstract: An apparatus for use within a cognitive information processing system environment comprising: a graph query engine, the graph query engine coupled to receive data from a plurality of data sources, the graph query engine receiving and processing queries and to bridge the queries into a cognitive graph.

Proceedings Article•DOI•

A conceptual model for event-sourced graph computing

[...]

Benjamin Erb¹, Frank Kargl¹•Institutions (1)

University of Ulm¹

24 Jun 2015

TL;DR: An extended conceptual model for event-driven computing on graphs is suggested that takes into account the evolution of a graph and enables temporal analyses, processing on previous graph states, and retroactive modifications.

Abstract: Systems for highly interconnected application domains are increasingly taking advantage of graph-based computing platforms. Existing platforms employ a batch-oriented computing model and neglect near-realtime processing or temporal analysis. We suggest an extended conceptual model for event-driven computing on graphs. It takes into account the evolution of a graph and enables temporal analyses, processing on previous graph states, and retroactive modifications.
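
The general event-sourcing idea behind processing on previous graph states can be sketched as replaying an append-only log of graph events up to a chosen point in time. The Python example below is an illustration of that general idea only, not the authors' conceptual model. The event tuple layout `(timestamp, kind, payload)`, the event kinds, and the `replay` function are assumptions made for this sketch.

```python
def replay(events, until):
    """Rebuild the graph state from all events with timestamp <= `until`."""
    vertices, edges = set(), set()
    for timestamp, kind, payload in sorted(events, key=lambda e: e[0]):
        if timestamp > until:
            break
        if kind == "add_vertex":
            vertices.add(payload)
        elif kind == "add_edge":
            edges.add(payload)
        elif kind == "remove_edge":
            edges.discard(payload)
        elif kind == "remove_vertex":
            vertices.discard(payload)
            # Removing a vertex also removes every edge that touches it.
            edges = {e for e in edges if payload not in e}
    return vertices, edges

# Hypothetical event log; edges are (source, target) tuples.
log = [
    (1, "add_vertex", "a"),
    (2, "add_vertex", "b"),
    (3, "add_edge", ("a", "b")),
    (4, "remove_vertex", "b"),
]
state_at_3 = replay(log, until=3)
state_at_4 = replay(log, until=4)
```

Because the log is never overwritten, the state at time 3 (two vertices and one edge) stays recoverable even after the removal at time 4, which is what makes temporal analysis possible. A retroactive modification would correspond to inserting an event at an earlier timestamp and replaying again.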

Proceedings Article•

Beta-algebra: Towards a Relational Algebra for Graph Analysis

[...]

Luiz Celso Gomes¹, Bernd Amann, André Santanchè¹•Institutions (1)

State University of Campinas¹

01 Mar 2015

TL;DR: This paper presents ongoing efforts towards a relational algebra extension that offers an operator for graph-based data aggregation, and introduces the current algebra and provides examples of its use.

Abstract: Graph analysis is an essential tool to understand natural and man-made networks, such as social networks, food webs, transportation infrastructures, etc. Although graph analysis has fomented the development of algorithms, visual tools, and distributed processing frameworks, there is still little support for analysis at the query language level. Current graph query languages are mostly concerned with flexible matching of subgraphs, while graph processing frameworks are mostly concerned with fast parallel execution of instructions. Our goal is to provide analysis capabilities at the language level, allowing more interactive and explorative query-based analysis. In this paper, we present our ongoing efforts towards a relational algebra extension that offers an operator for graph-based data aggregation. The beta (β) operator is composed of four suboperators, which are used to control the path-based aggregations. The β-algebra allows seamless composition of queries that mix relational and graph-based aspects. Here we introduce our current algebra and provide examples of its use. We also show how we are using the analysis strategy in query scenarios. Since the algebra-based query scenario allows for execution plan rewritings, we also discuss our first efforts on equivalence rules for query optimization.

Patent•

Computing apparatus and method for managing a graph database

[...]

Hu Bo¹, Roger Menday¹•Institutions (1)

Fujitsu¹

31 Dec 2015

TL;DR: The patent describes a computing apparatus that automates the integration of non-conceptual data items into a data graph, where the data graph is composed of graph elements including graph nodes and graph edges.

Abstract: Embodiments include a computing apparatus configured to automate the integration of non-conceptual data items into a data graph, the data graph being composed of graph elements including graph nodes and graph edges, the computing apparatus comprising: a data storage system configured to store, as a graph node of the data graph for each of a plurality of non-conceptual data items, a behaviour handler defining a procedure for using the non-conceptual data item to update the data graph in response to an occurrence of a specified trigger event, the graph node representing the behaviour handler being stored in association with the non-conceptual data item; an execution module configured to execute the procedure defined by a behaviour handler from among the behaviour handlers in response to an occurrence of the specified trigger event for the behaviour handler; a modification identification module configured to identify graph elements modified as a consequence of the execution of the procedure, and to record the identified graph elements as members of a set of modifications attributed to the behaviour handler defining the executed procedure; an inference module configured to infer relationships between behaviour handlers by, for each pair of behaviour handlers defining executed procedures, analysing the sets of modifications attributed to the pair of behaviour handlers in order to identify relationships between the sets of modifications, and adding the identified relationships to the data graph as edges between the graph nodes representing the respective behaviour handlers.
