
Showing papers on "Wait-for graph" published in 2015


Journal ArticleDOI
TL;DR: In this survey, the vertex-centric approach to graph processing is overviewed, TLAV frameworks are deconstructed into four main components and respectively analyzed, and TLAV implementations are reviewed and categorized.
Abstract: The vertex-centric programming model is an established computational paradigm recently incorporated into distributed processing frameworks to address challenges in large-scale graph processing. Billion-node graphs that exceed the memory capacity of commodity machines are not well supported by popular Big Data tools like MapReduce, which are notoriously poor performing for iterative graph algorithms such as PageRank. In response, a new type of framework challenges one to “think like a vertex” (TLAV) and implements user-defined programs from the perspective of a vertex rather than a graph. Such an approach improves locality, demonstrates linear scalability, and provides a natural way to express and compute many iterative graph algorithms. These frameworks are simple to program and widely applicable but, like an operating system, are composed of several intricate, interdependent components, of which a thorough understanding is necessary in order to elicit top performance at scale. To this end, the first comprehensive survey of TLAV frameworks is presented. In this survey, the vertex-centric approach to graph processing is overviewed, TLAV frameworks are deconstructed into four main components and respectively analyzed, and TLAV implementations are reviewed and categorized.

267 citations
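
The "think like a vertex" model described in the survey abstract above can be illustrated with a toy, single-process sketch of PageRank. This is only an assumption-laden illustration: the function name `pagerank_vertex_centric`, its parameters, and the handling of dangling vertices are hypothetical and are not taken from any framework covered by the survey. Each superstep has a message phase (every vertex sends a share of its rank along its out-edges) and a compute phase (every vertex updates its own value from its inbox).

```python
def pagerank_vertex_centric(out_edges, supersteps=20, damping=0.85):
    """Toy synchronous vertex-centric PageRank.

    out_edges maps every vertex to the list of its out-neighbours.
    Every vertex must appear as a key, even if its list is empty.
    """
    n = len(out_edges)
    rank = {v: 1.0 / n for v in out_edges}
    for _ in range(supersteps):
        # Message phase: each vertex sends rank / out_degree to its neighbours.
        inbox = {v: 0.0 for v in out_edges}
        dangling = 0.0
        for v, targets in out_edges.items():
            if targets:
                share = rank[v] / len(targets)
                for t in targets:
                    inbox[t] += share
            else:
                # Assumption: rank of vertices without out-edges is spread evenly.
                dangling += rank[v]
        # Compute phase: each vertex updates its value from its own inbox only.
        rank = {
            v: (1.0 - damping) / n + damping * (inbox[v] + dangling / n)
            for v in out_edges
        }
    return rank
```

A real distributed framework would partition the vertices across machines and exchange the inbox contents over the network between supersteps; the sketch keeps everything in one dictionary to show only the programming model.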


Book
10 Jun 2015
TL;DR: This second edition of this practical book includes new code samples and diagrams, using the latest Neo4j syntax, as well as information on new functionality.
Abstract: Discover how graph databases can help you manage and query highly connected data. With this practical book, you'll learn how to design and implement a graph database that brings the power of graphs to bear on a broad range of problem domains. Whether you want to speed up your response to user queries or build a database that can adapt as your business evolves, this book shows you how to apply the schema-free graph model to real-world problems. This second edition includes new code samples and diagrams, using the latest Neo4j syntax, as well as information on new functionality. Learn how different organizations are using graph databases to outperform their competitors. With this book's data modeling, query, and code examples, you'll quickly be able to implement your own solution. Model data with the Cypher query language and property graph model. Learn best practices and common pitfalls when modeling with graphs. Plan and implement a graph database solution in test-driven fashion. Explore real-world examples to learn how and why organizations use a graph database. Understand common patterns and components of graph database architecture. Use analytical techniques and algorithms to mine graph database information.

184 citations


Proceedings ArticleDOI
04 Oct 2015
TL;DR: An analysis of three high-performance graph algorithm codebases, using hardware performance counters on a conventional dual-socket server, finds substantial room for a different processor architecture to improve performance without requiring a new memory system.
Abstract: Graph processing is an increasingly important application domain and is typically communication-bound. In this work, we analyze the performance characteristics of three high-performance graph algorithm codebases using hardware performance counters on a conventional dual-socket server. Unlike many other communication-bound workloads, graph algorithms struggle to fully utilize the platform's memory bandwidth and so increasing memory bandwidth utilization could be just as effective as decreasing communication. Based on our observations of simultaneous low compute and bandwidth utilization, we find there is substantial room for a different processor architecture to improve performance without requiring a new memory system.

164 citations


Proceedings ArticleDOI
15 Nov 2015
TL;DR: This paper characterizes GraphBIG on real machines and observes extremely irregular memory patterns and significantly diverse behavior across different computations, helping users understand the impact of modern graph computing on the hardware architecture and enabling future architecture and system research.
Abstract: With the emergence of data science, graph computing is becoming a crucial tool for processing big connected data. Although efficient implementations of specific graph applications exist, the behavior of full-spectrum graph computing remains unknown. To understand graph computing, we must consider multiple graph computation types, graph frameworks, data representations, and various data sources in a holistic way. In this paper, we present GraphBIG, a benchmark suite inspired by the IBM System G project. To cover major graph computation types and data sources, GraphBIG selects representative data structures, workloads and data sets from 21 real-world use cases of multiple application domains. We characterized GraphBIG on real machines and observed extremely irregular memory patterns and significantly diverse behavior across different computations. GraphBIG helps users understand the impact of modern graph computing on the hardware architecture and enables future architecture and system research.

141 citations


Proceedings ArticleDOI
15 Nov 2015
TL;DR: This paper presents a fast distributed graph processing system, namely PGX.D, as a low-overhead, bandwidth-efficient communication framework that supports remote data-pulling patterns and recommends the use of balanced beefy clusters where the sustained random DRAM-access bandwidth in aggregate is matched with the bandwidth of the underlying interconnection fabric.
Abstract: Graph analysis is a powerful method in data analysis. Although several frameworks have been proposed for processing large graph instances in distributed environments, their performance is much lower than using efficient single-machine implementations provided with enough memory. In this paper, we present a fast distributed graph processing system, namely PGX.D. We show that PGX.D outperforms other distributed graph systems like GraphLab significantly (3x to 90x). Furthermore, PGX.D on 4 to 16 machines is also faster than an implementation optimized for single-machine execution. Using a fast cooperative context-switching mechanism, we implement PGX.D as a low-overhead, bandwidth-efficient communication framework that supports remote data-pulling patterns. Moreover, PGX.D achieves large traffic reduction and good workload balance by applying selective ghost nodes, edge partitioning, and edge chunking transparently to the user. Our analysis confirms that each of these features is indeed crucial for overall performance of certain kinds of graph algorithms. Finally, we advocate the use of balanced beefy clusters where the sustained random DRAM-access bandwidth in aggregate is matched with the bandwidth of the underlying interconnection fabric.

87 citations


Proceedings ArticleDOI
12 Nov 2015
TL;DR: This work outlines a Graph BLAS-like, linear algebra based approach to miniTri, one such miniapp, and describes a task-based prototype implementation and gives initial scalability results.
Abstract: It is challenging to obtain scalable HPC performance on real applications, especially for data science applications with irregular memory access and computation patterns. To drive co-design efforts in architecture, system, and application design, we are developing miniapps representative of data science workloads. These in turn stress the state of the art in Graph BLAS-like Graph Algorithm Building Blocks (GABB). In this work, we outline a Graph BLAS-like, linear algebra based approach to miniTri, one such miniapp. We describe a task-based prototype implementation and give initial scalability results.

38 citations


Journal ArticleDOI
01 Aug 2015
TL;DR: This work demonstrates a system, called GraphGen, that enables users to declaratively specify graph extraction tasks over relational databases, visually explore the extracted graphs, and write and execute graph algorithms over them, either directly or using existing graph libraries like the widely used NetworkX Python library.
Abstract: Analyzing interconnection structures among the data through the use of graph algorithms and graph analytics has been shown to provide tremendous value in many application domains. However, graphs are not the primary choice for how most data is currently stored, and users who want to employ graph analytics are forced to extract data from their data stores, construct the requisite graphs, and then use a specialized engine to write and execute their graph analysis tasks. This cumbersome and costly process not only raises barriers in using graph analytics, but also makes it hard to explore and identify hidden or implicit graphs in the data. Here we demonstrate a system, called GraphGen, that enables users to declaratively specify graph extraction tasks over relational databases, visually explore the extracted graphs, and write and execute graph algorithms over them, either directly or using existing graph libraries like the widely used NetworkX Python library. We also demonstrate how unifying the extraction tasks and the graph algorithms enables significant optimizations that would not be possible otherwise.

38 citations


Proceedings ArticleDOI
29 Oct 2015
TL;DR: It is shown that vertex-centric graph analysis can be translated to SQL queries, typically involving table scans and joins, and that modern column-oriented databases are very well suited to running such queries.
Abstract: Graph analytics is becoming increasingly popular, with a number of new applications and systems developed in the past few years. In this paper, we study the Vertica relational database as a platform for graph analytics. We show that vertex-centric graph analysis can be translated to SQL queries, typically involving table scans and joins, and that modern column-oriented databases are very well suited to running such queries. Furthermore, we show how developers can trade memory footprint for significantly reduced I/O costs in Vertica. We present an experimental evaluation of the Vertica relational database system on a variety of graph analytics, including iterative analysis, a combination of graph and relational analyses, and more complex 1-hop neighborhood graph analytics, showing that it is competitive with two popular vertex-centric graph analytics systems, namely Giraph and GraphLab.

33 citations


Proceedings ArticleDOI
29 Jun 2015
TL;DR: In this article, an extensible graph traversal framework (GRAPHITE) is proposed as a central graph processing component on a common storage engine inside a relational database management system, and two traversal algorithm implementations are derived for different graph topologies and traversal queries.
Abstract: Graph traversals are a basic but fundamental ingredient for a variety of graph algorithms and graph-oriented queries. To achieve the best possible query performance, they need to be implemented at the core of a database management system that aims at storing, manipulating, and querying graph data. Increasingly, modern business applications demand native graph query and processing capabilities for enterprise-critical operations on data stored in relational database management systems. In this paper we propose an extensible graph traversal framework (GRAPHITE) as a central graph processing component on a common storage engine inside a relational database management system. We study the influence of the graph topology on the execution time of graph traversals and derive two traversal algorithm implementations specialized for different graph topologies and traversal queries. We conduct extensive experiments on GRAPHITE for a large variety of real-world graph data sets and input configurations. Our experiments show that the proposed traversal algorithms differ by up to two orders of magnitude for different input configurations and therefore demonstrate the need for a versatile framework to efficiently process graph traversals on a wide range of different graph topologies and types of queries. Finally, we highlight that the query performance of our traversal implementations is competitive with those of two native graph database management systems.

31 citations


Proceedings ArticleDOI
01 Jan 2015
TL;DR: This paper introduces an innovative concurrent graph model that provides addition and removal of any edge of the graph, as well as atomic traversals of a part (or the entirety), and presents Dense, a concurrent graph implementation that aims at mitigating the two aforementioned implementation drawbacks.
Abstract: Graphs are versatile data structures that allow the implementation of a variety of applications, such as computer-aided design and manufacturing, video gaming, or scientific simulations. However, although data structures such as queues, stacks, and trees have been widely studied and implemented in the concurrent context, multi-process applications that rely on graphs still largely use a sequential implementation where accesses are synchronized through the use of global locks or partitioning, thus imposing serious performance bottlenecks. In this paper we introduce an innovative concurrent graph model that provides addition and removal of any edge of the graph, as well as atomic traversals of a part (or the entirety) of the graph. We further present Dense, a concurrent graph implementation that aims at mitigating the two aforementioned implementation drawbacks. Dense achieves wait-freedom by relying on helping and provides the inbuilt capability of performing a partial snapshot on a dynamically determined subset of the graph.

30 citations


Proceedings ArticleDOI
15 May 2015
TL;DR: An overview of the different types of graph databases, their applications, and a comparison between their models based on some properties is given.
Abstract: In the era of big data, data analytics, and business intelligence, database management plays a vital role from the technical, business management, and research points of view. Database management has been a topic of active research for many decades. Different types of database management systems have been proposed over time, but the Relational Database Management System (RDBMS) is the one most widely used in academic research as well as in industry [1]. In recent years, graph databases have regained interest among researchers. One of the most important reasons for this interest is the inherent property of graphs as a graph structure. Graphs are present everywhere in data structures and represent the strong connectivity within the data. In most graph database models, the data structures for schema and instances are modeled as a graph or a generalization of a graph. In such graph database models, data manipulations are expressed by graph-oriented operations and type constructors [9]. Nowadays, most real-world applications can be modeled as a graph, and one of the best real-world examples is a social or biological network. This paper gives an overview of the different types of graph databases, their applications, and a comparison between their models based on some properties.

Journal ArticleDOI
TL;DR: Lower bounds for the coupling strengths of oscillators in directed networks to guarantee global synchronization are proposed and examples are provided to demonstrate how to apply the proposed methodologies to typical directed complex networks.
Abstract: This paper proposes lower bounds for the coupling strengths of oscillators in directed networks to guarantee global synchronization. The novel idea of graph comparison from spectral graph theory is employed so that the combinatorial features of a given network can be fully utilized to simplify computations. For large networks that can be decomposed into a set of smaller strongly connected components, the comparison can be carried out at the local level as well. To validate theoretical analysis, examples are provided to demonstrate how to apply the proposed methodologies to typical directed complex networks.

Proceedings ArticleDOI
04 Oct 2015
TL;DR: It is shown that efficient building block operators enable more powerful operations for fast information propagation and result in fewer device kernel invocations, less data movement, and fewer global synchronizations, and thus are key focus areas for efficient large-scale graph analytics on the GPU.
Abstract: We identify several factors that are critical to high-performance GPU graph analytics: efficient building block operators, synchronization and data movement, workload distribution and load balancing, and memory access patterns. We analyze the impact of these critical factors through three GPU graph analytic frameworks, Gunrock, MapGraph, and VertexAPI2. We also examine their effect on different workloads: four common graph primitives from multiple graph application domains, evaluated through real-world and synthetic graphs. We show that efficient building block operators enable more powerful operations for fast information propagation and result in fewer device kernel invocations, less data movement, and fewer global synchronizations, and thus are key focus areas for efficient large-scale graph analytics on the GPU.

Proceedings ArticleDOI
24 Jan 2015
TL;DR: A core language expressive enough to represent the three most widespread barrier synchronisation patterns (group, split-phase, and dynamic membership) is introduced, a graph analysis technique that selects from two alternative graph representations is proposed, and finding a deadlock in either representation is proved to be equivalent.
Abstract: We present Armus, a dynamic verification tool for deadlock detection and avoidance specialised in barrier synchronisation. Barriers are used to coordinate the execution of groups of tasks, and serve as a building block of parallel computing. Our tool verifies more barrier synchronisation patterns than the current state of the art. To improve the scalability of verification, we introduce a novel event-based representation of concurrency constraints, and a graph-based technique for deadlock analysis. The implementation is distributed and fault-tolerant, and can verify X10 and Java programs. To formalise the notion of barrier deadlock, we introduce a core language expressive enough to represent the three most widespread barrier synchronisation patterns: group, split-phase, and dynamic membership. We propose a graph analysis technique that selects from two alternative graph representations: the Wait-For Graph, which favours programs with more tasks than barriers; and the State Graph, optimised for programs with more barriers than tasks. We prove that finding a deadlock in either representation is equivalent, and that the verification algorithm is sound and complete with respect to the notion of deadlock in our core language. Armus is evaluated with three benchmark suites in local and distributed scenarios. The benchmarks show that graph analysis with automatic graph-representation selection can record a 7-fold execution increase versus the traditional fixed graph representation. The performance measurements for distributed deadlock detection between 64 processes show negligible overheads.
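
The Wait-For Graph mentioned in the abstract above is, in its textbook form, a directed graph in which an edge from task `t` to task `u` means that `t` is blocked waiting for `u`; a deadlock then corresponds to a cycle. The following is a minimal generic sketch of that idea, not the Armus algorithm: the function name `find_deadlock` and the dictionary-based graph encoding are hypothetical choices made for illustration.

```python
def find_deadlock(wait_for):
    """Return the tasks on one cycle of the wait-for graph, or None.

    wait_for maps a task to the tasks it is currently waiting for.
    Uses an iterative depth-first search with three vertex states.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {}
    for root in wait_for:
        if colour.get(root, WHITE) != WHITE:
            continue
        colour[root] = GREY
        path = [root]  # tasks on the current DFS path, in order
        stack = [(root, iter(wait_for.get(root, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                state = colour.get(nxt, WHITE)
                if state == GREY:
                    # nxt is already on the path: the path suffix is a cycle.
                    return path[path.index(nxt):]
                if state == WHITE:
                    colour[nxt] = GREY
                    path.append(nxt)
                    stack.append((nxt, iter(wait_for.get(nxt, ()))))
                    break
            else:
                # All successors explored without finding a cycle.
                colour[node] = BLACK
                stack.pop()
                path.pop()
    return None
```

For example, `find_deadlock({"t1": ["t2"], "t2": ["t3"], "t3": ["t1"]})` reports the three tasks as mutually blocked, while a graph without cycles yields `None`.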

Proceedings ArticleDOI
08 Sep 2015
TL;DR: This work describes the design and implementation of a new graph processing system based on Bulk Synchronous Parallel model, built on top of ZHT, a scalable distributed key-value store, which benefits the graph processing in terms of scalability, performance and persistency.
Abstract: The emerging applications in big data and social networks issue rapidly increasing demands on graph processing. Graph query operations that involve a large number of vertices and edges can be tremendously slow on traditional databases. The state-of-the-art graph processing systems and databases usually adopt master/slave architecture that potentially impairs their scalability. This work describes the design and implementation of a new graph processing system based on Bulk Synchronous Parallel model. Our system is built on top of ZHT, a scalable distributed key-value store, which benefits the graph processing in terms of scalability, performance and persistency. The experiment results imply excellent scalability.

Proceedings ArticleDOI
07 Dec 2015
TL;DR: This method provides a general solution to the problem of mining hierarchical models from unannotated visual data without exhaustive search of objects and is applied to RGB/RGB-D images and videos to demonstrate its generality and the wide range of applicability.
Abstract: This paper reformulates the theory of graph mining on the technical basis of graph matching, and extends its scope of applications to computer vision. Given a set of attributed relational graphs (ARGs), we propose to use a hierarchical And-Or Graph (AoG) to model the pattern of maximal-size common subgraphs embedded in the ARGs, and we develop a general method to mine the AoG model from the unlabeled ARGs. This method provides a general solution to the problem of mining hierarchical models from unannotated visual data without exhaustive search of objects. We apply our method to RGB/RGB-D images and videos to demonstrate its generality and the wide range of applicability. The code will be available at https://sites.google.com/site/quanshizhang/mining-and-or-graphs.

Posted Content
TL;DR: A graph pattern engine, EmptyHeaded, is presented that uses recent algorithmic advances in join processing to compile patterns into Boolean algebra operations that exploit SIMD parallelism and outperforms specialized graph engines by over an order of magnitude and relational systems by over two orders of magnitude.
Abstract: We present a graph pattern engine, EmptyHeaded, that uses recent algorithmic advances in join processing to compile patterns into Boolean algebra operations that exploit SIMD parallelism. The EmptyHeaded engine demonstrates that treating graph patterns as a general join processing problem can compete with and often outperform both specialized approaches and existing OLAP systems on graph queries. The core Boolean algebra operation performed in EmptyHeaded is set intersection. Extracting SIMD parallelism during set intersections on graph data is challenging because graph data can be skewed in several different ways. Our contributions are a demonstration of this new type of engine with Boolean algebra at its core, an exploration of set intersection representations and algorithms for set intersections that are optimized for skew. We demonstrate that EmptyHeaded outperforms specialized graph engines by over an order of magnitude and relational systems by over two orders of magnitude. Our results suggest that this new style of engine is a promising new direction for future graph engines and accelerators.
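
To make the role of set intersection in graph pattern queries concrete, here is a small scalar sketch that counts triangles by intersecting sorted neighbour lists. It is an illustration under stated assumptions only: it does not use SIMD, it is not EmptyHeaded's implementation, and the names `intersect_sorted` and `count_triangles` are hypothetical.

```python
def intersect_sorted(a, b):
    """Merge-style intersection of two sorted lists without duplicates."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def count_triangles(edges):
    """Count triangles in an undirected graph given as (u, v) pairs."""
    # Orient every edge from the smaller to the larger vertex id so that
    # each triangle is found exactly once.
    higher = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        lo, hi = (u, v) if u < v else (v, u)
        higher.setdefault(lo, set()).add(hi)
        higher.setdefault(hi, set())
    nbrs = {v: sorted(s) for v, s in higher.items()}
    # For each oriented edge (u, v), common higher neighbours close a triangle.
    return sum(
        len(intersect_sorted(nbrs[u], nbrs[v]))
        for u in nbrs
        for v in nbrs[u]
    )
```

The inner intersection dominates the running time, which is why the choice of set representation and intersection algorithm matters so much on skewed graphs.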

Book ChapterDOI
03 Jul 2015
TL;DR: An overview of recent results and techniques in parameterized algorithms for graph modification problems is given.
Abstract: We give an overview of recent results and techniques in parameterized algorithms for graph modification problems.

Proceedings ArticleDOI
30 Mar 2015
TL;DR: This paper introduces Table2Graph, the graph construction tool based on Map-Reduce framework over Hadoop that turns multiple disparate data sources into a single heterogeneous graph model so that matching between entities across different source data would be expedited by examining their linkages in the graph.
Abstract: Identifying correlations and relationships between entities within and across different data sets (or databases) is of great importance in many domains. The data warehouse-based integration, which has been most widely practiced, is found to be inadequate to achieve such a goal. Instead we explored an alternate solution that turns multiple disparate data sources into a single heterogeneous graph model so that matching between entities across different source data would be expedited by examining their linkages in the graph. We found, however, that while a graph-based model provides outstanding capabilities for this purpose, construction of one such model from relational source databases was time consuming and primarily left to ad hoc proprietary scripts. This led us to develop a reconfigurable and reusable graph construction tool that is designed to work at scale. In this paper, we introduce Table2Graph, the graph construction tool based on Map-Reduce framework over Hadoop. We also discuss results from applying Table2Graph to integrate disparate healthcare databases.

Patent
09 Apr 2015
TL;DR: In this paper, the authors propose a homoiconic or executable graph framework for representing complex heterogeneous characteristics of processes, systems, and systems of systems that feature many to many interrelationships.
Abstract: The present invention addresses deficiencies of the art with respect to collaborative computer networks consisting of mixed data, control functions, analysis functions, and sensors in complex systems of systems. The method involves a database framework for representing complex heterogeneous characteristics of processes, systems, and systems of systems that feature many to many interrelationships. The homoiconic graph framework takes the form of an executable graph database, which is often faster for associative data sets, and maps more directly to object-oriented computer applications for large-scale operations. The invention provides a method to execute the graph database, in that it comprises nodes that are both data fragments and executable components. It is characterized as a homoiconic or executable graph framework to distinguish this unique feature from the concept of a graph database, which generally is a repository of connected data only.

01 Jan 2015
TL;DR: The aim of this paper is to evaluate, through indexing techniques, the performance of Neo4j and OrientDB, both graph database technologies, and to identify the strengths and weaknesses of each technology as a candidate storage mechanism for a graph structure.
Abstract: The aim of this paper is to evaluate, through indexing techniques, the performance of Neo4j and OrientDB, both graph database technologies, and to identify the strengths and weaknesses of each technology as a candidate storage mechanism for a graph structure. In the context of graph databases, an index is a data structure that speeds up the search for a specific node. This data structure is usually a B-tree, but it can also be a hash table or some other logical structure. The main purpose of an index is to speed up search queries, primarily by reducing the number of nodes in a graph or table that must be examined. Graphs and graph databases are most commonly associated with social networking or “graph search” style recommendations. These technologies are thus a core platform for Internet giants such as Hi5, Facebook, Google, Badoo, Twitter and LinkedIn. The key to understanding graph database systems in the social networking context is that they give equal prominence to storing both the data (users, favorites) and the relationships between them (who liked what, who ‘follows’ whom, which post was liked the most, what is the shortest path to ‘reach’ whom). In a case study, a Twitter social network of almost 5,000 nodes was imported into local servers (Neo4j and OrientDB) and queried to retrieve the node containing the searched data, first without an index (full scan) and then with an index, in order to compare the response time (statement query time) of the two graph databases and determine which performs better (in terms of the speed of data or information retrieval) and in which case. The main results are presented in Section 6.

Proceedings ArticleDOI
06 Jul 2015
TL;DR: An implementation of eigenvector centrality, a prominent member of the broad class of spectral centrality, in Java and NetBeans designed for use with Neo4j, a major schemaless graph database, is outlined and the findings resulting from its application to a real world social graph are discussed.
Abstract: Graphs are currently the epicenter of intense research as they lay the theoretical groundwork in diverse fields ranging from combinatorial optimization to computational neuroscience. Vertex centrality plays a crucial role in graph mining as it ranks vertices according to their contribution to overall graph communication. Specifically, within the social network analysis context centrality identifies influential individuals, whereas in the bioinformatics field centrality locates dominant proteins in protein-to-protein interaction. In recent years graph databases, part of the rising NoSQL movement, have been added to the graph analysis toolset. An implementation of eigenvector centrality, a prominent member of the broad class of spectral centrality, in Java and NetBeans designed for use with Neo4j, a major schemaless graph database, is outlined and the findings resulting from its application to a real world social graph are discussed.
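
Eigenvector centrality, as referred to in the abstract above, is commonly computed by power iteration on the adjacency structure. The sketch below shows that standard technique in plain Python; it is not the Java/Neo4j implementation from the paper, and the function name `eigenvector_centrality`, its parameters, and the adjacency-dictionary input are assumptions made for illustration. Plain power iteration may oscillate on bipartite graphs, so the sketch is intended for connected, non-bipartite undirected graphs.

```python
import math


def eigenvector_centrality(adjacency, iterations=100, tol=1e-10):
    """Power iteration on an undirected graph.

    adjacency maps every vertex to the list of its neighbours; each
    undirected edge must be listed in both directions.
    """
    nodes = list(adjacency)
    score = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        # Each vertex passes its current score to all of its neighbours.
        new = {v: 0.0 for v in nodes}
        for v in nodes:
            for u in adjacency[v]:
                new[u] += score[v]
        # Normalise to unit Euclidean length so values stay bounded.
        norm = math.sqrt(sum(x * x for x in new.values()))
        if norm == 0.0:
            return score  # graph has no edges
        new = {v: x / norm for v, x in new.items()}
        if max(abs(new[v] - score[v]) for v in nodes) < tol:
            return new
        score = new
    return score
```

On a triangle with one extra pendant vertex attached, the vertex that carries the pendant receives the highest score and the pendant itself the lowest, which matches the intuition that centrality rewards being connected to well-connected vertices.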

Proceedings ArticleDOI
31 May 2015
TL;DR: The graph model used by Frappé is detailed and its key use cases are outlined using representative queries and their runtimes with the dependency graph data of the Unbreakable Enterprise Kernel.
Abstract: Frappe is a developer tool for querying and visualizing the dependencies of large C/C++ software systems to the order of 10s of millions of lines of code in size. It supports developers with a range of code comprehension queries such as "Does function X or something it calls write to global variable Y?" and "How much code could be affected if I change this macro?" Results are overlaid on a visualization of the dependency graph data based on a cartographic map metaphor. In this paper, we give a brief overview of Frappe and describe our experiences implementing it on top of the Neo4j graph database. We detail the graph model used by Frappe and outline its key use cases using representative queries and their runtimes with the dependency graph data of the Unbreakable Enterprise Kernel. Finally, we discuss some of the open challenges in supporting source code queries across single and multiple versions of an evolving codebase with current property graph database technologies: performance, efficient storage, and the expressivity of the graph querying language given a graph model.

Journal ArticleDOI
TL;DR: A graph algorithm based model analysis framework that can be accessed by specialized model analysis techniques is introduced and it is proved that basic graph algorithms are feasible to support such a framework.
Abstract: Analysing conceptual models is a frequent task of business process management (BPM), for instance to support comparison or integration of business processes, to check business processes for compliance or weaknesses, or to tailor conceptual models for different audiences. Because many companies have recently started to maintain large model collections, and analysing such collections manually may be laborious, practitioners have articulated a demand for automatic model analysis support. Hence, BPM scholars have proposed a plethora of different model analysis techniques. As virtually any conceptual model can be interpreted as a mathematical graph and model analysis techniques often include some kind of graph problem, in this paper, we introduce a graph algorithm based model analysis framework that can be accessed by specialized model analysis techniques. To prove that basic graph algorithms are feasible to support such a framework, we conduct a performance analysis of selected graph algorithms.

Proceedings ArticleDOI
02 Aug 2015
TL;DR: This paper describes a fuzzy query language, called FUDGE, that can be used to express preference queries on fuzzy graph databases and is implemented in a system, called SUGAR, that is presented in this article.
Abstract: Graph databases have attracted considerable interest in recent years thanks to their large scope of potential applications (e.g. social networks, biomedical networks, data stemming from the web). As has already been proposed for relational databases, defining a language allowing flexible querying of graph databases may greatly improve the usability of data. This paper focuses on the notion of fuzzy graph database and describes a fuzzy query language that makes it possible to handle such a database, which may be fuzzy or not, in a flexible way. This language, called FUDGE, can be used to express preference queries on fuzzy graph databases. The preferences concern i) the content of the vertices of the graph and ii) the structure of the graph. The FUDGE language is implemented in a system, called SUGAR, that we present in this article. We also discuss implementation issues of the FUDGE language in SUGAR.

Patent
23 Jul 2015
TL;DR: Systems, methods, and non-transitory computer readable media are disclosed that receive at least one database query, generate and optimize a corresponding computation graph, and distribute portions of the optimized graph to a plurality of distributed computing systems for execution.
Abstract: Various embodiments of the present disclosure can include systems, methods, and non-transitory computer readable media configured to receive at least one database query to be executed. At least one computation graph corresponding to the at least one database query is generated. The computation graph is transformed to an optimized computation graph. Respective portions of the optimized computation graph are distributed to a plurality of distributed computing systems for execution. A result for the at least one database query is provided.

Patent
24 Feb 2015
TL;DR: An apparatus for use within a cognitive information processing system environment is described, comprising a graph query engine that is coupled to receive data from a plurality of data sources and that receives and processes queries and bridges the queries into a cognitive graph.
Abstract: An apparatus for use within a cognitive information processing system environment comprising: a graph query engine, the graph query engine coupled to receive data from a plurality of data sources, the graph query engine receiving and processing queries and bridging the queries into a cognitive graph.

Proceedings ArticleDOI
24 Jun 2015
TL;DR: An extended conceptual model for event-driven computing on graphs is suggested that takes into account the evolution of a graph and enables temporal analyses, processing on previous graph states, and retroactive modifications.
Abstract: Systems for highly interconnected application domains are increasingly taking advantage of graph-based computing platforms. Existing platforms employ a batch-oriented computing model and neglect near-realtime processing or temporal analysis. We suggest an extended conceptual model for event-driven computing on graphs. It takes into account the evolution of a graph and enables temporal analyses, processing on previous graph states, and retroactive modifications.
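
One common way to obtain the three capabilities named in the abstract above (temporal analysis, processing on previous graph states, and retroactive modifications) is to keep the graph as an ordered log of change events and rebuild a state by replaying the log. The Python sketch below illustrates that general idea only; it is an assumption and not the conceptual model proposed by the authors, and the `EventGraph` class with its `record` and `edges_at` methods is hypothetical.

```python
import bisect


class EventGraph:
    """Hypothetical event-sourced graph: edges are never stored directly,
    only the timestamped events that added or removed them."""

    def __init__(self):
        self._events = []  # kept sorted as (timestamp, sequence, op, edge)
        self._seq = 0      # tie-breaker so events with equal timestamps keep arrival order

    def record(self, timestamp, op, u, v):
        """Record an 'add' or 'remove' event for edge (u, v).
        A timestamp earlier than existing events is a retroactive modification."""
        if op not in ("add", "remove"):
            raise ValueError("op must be 'add' or 'remove'")
        bisect.insort(self._events, (timestamp, self._seq, op, (u, v)))
        self._seq += 1

    def edges_at(self, timestamp):
        """Replay all events up to and including `timestamp` and return the edge set."""
        edges = set()
        for ts, _seq, op, edge in self._events:
            if ts > timestamp:
                break
            if op == "add":
                edges.add(edge)
            else:
                edges.discard(edge)
        return edges


g = EventGraph()
g.record(1, "add", "a", "b")
g.record(3, "add", "b", "c")
g.record(5, "remove", "a", "b")
g.record(2, "add", "a", "c")  # retroactive: inserted into the past

print(sorted(g.edges_at(2)))  # [('a', 'b'), ('a', 'c')]
print(sorted(g.edges_at(5)))  # [('a', 'c'), ('b', 'c')]
```

Replaying from the start on every query is linear in the log length; a practical system would presumably add snapshots or indexing, which this sketch leaves out.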

Proceedings Article
01 Mar 2015
TL;DR: This paper presents ongoing efforts towards a relational algebra extension that offers an operator for graph-based data aggregation, and introduces the current algebra and provides examples of its use.
Abstract: Graph analysis is an essential tool to understand natural and man-made networks, such as social networks, food webs, transportation infrastructures, etc. Although graph analysis has fomented the development of algorithms, visual tools, and distributed processing frameworks, there is still little support for analysis at the query language level. Current graph query languages are mostly concerned with flexible matching of subgraphs, while graph processing frameworks are mostly concerned with fast parallel execution of instructions. Our goal is to provide analysis capabilities at the language level, allowing more interactive and explorative query-based analysis. In this paper, we present our ongoing efforts towards a relational algebra extension that offers an operator for graph-based data aggregation. The beta (β) operator is composed of four suboperators, which are used to control the path-based aggregations. The β-algebra allows seamless composition of queries that mix relational and graph-based aspects. Here we introduce our current algebra and provide examples of its use. We also show how we are using the analysis strategy in query scenarios. Since the algebra-based query scenario allows for execution plan rewritings, we also discuss our first efforts on equivalence rules for query optimization.

Patent
Hu Bo, Roger Menday
31 Dec 2015
TL;DR: A computing apparatus is described that automates the integration of non-conceptual data items into a data graph, the data graph being composed of graph elements including graph nodes and graph edges.
Abstract: Embodiments include a computing apparatus configured to automate the integration of non-conceptual data items into a data graph, the data graph being composed of graph elements including graph nodes and graph edges, the computing apparatus comprising: a data storage system configured to store, as a graph node of the data graph for each of a plurality of non-conceptual data items, a behaviour handler defining a procedure for using the non-conceptual data item to update the data graph in response to an occurrence of a specified trigger event, the graph node representing the behaviour handler being stored in association with the non-conceptual data item; an execution module configured to execute the procedure defined by a behaviour handler from among the behaviour handlers in response to an occurrence of the specified trigger event for the behaviour handler; a modification identification module configured to identify graph elements modified as a consequence of the execution of the procedure, and to record the identified graph elements as members of a set of modifications attributed to the behaviour handler defining the executed procedure; an inference module configured to infer relationships between behaviour handlers by, for each pair of behaviour handlers defining executed procedures, analysing the sets of modifications attributed to the pair of behaviour handlers in order to identify relationships between the sets of modifications, and adding the identified relationships to the data graph as edges between the graph nodes representing the respective behaviour handlers.
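
The inference step described in the abstract above compares, for each pair of behaviour handlers, the sets of graph elements their procedures modified. The abstract does not say which relationships are identified, so the Python sketch below assumes the simplest one, a non-empty overlap between two modification sets; the function `infer_relationships`, the handler names, and the element identifiers are all hypothetical.

```python
from itertools import combinations


def infer_relationships(modifications):
    """Given a mapping handler -> set of modified graph elements, return a list of
    (handler_a, handler_b, shared_elements) for every pair whose sets overlap.
    Overlap is an assumed relationship type, chosen only for illustration."""
    relationships = []
    for a, b in combinations(sorted(modifications), 2):
        shared = modifications[a] & modifications[b]
        if shared:
            relationships.append((a, b, frozenset(shared)))
    return relationships


# Hypothetical modification sets recorded after executing three handlers.
recorded = {
    "resize_image": {"node:42", "edge:42-7"},
    "extract_text": {"node:42", "node:99"},
    "update_index": {"node:100"},
}

for a, b, shared in infer_relationships(recorded):
    # Each result would become an edge between the two handler nodes in the data graph.
    print(a, b, sorted(shared))
```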