
Showing papers on "Wait-for graph" published in 2017


Proceedings ArticleDOI
26 Jun 2017
TL;DR: In this paper, the authors investigate the applicability of the push-pull dichotomy to various algorithms and its impact on complexity, performance, and the amount of locks, atomics, and reads/writes used.
Abstract: We reduce the cost of communication and synchronization in graph processing by analyzing the fastest way to process graphs: pushing the updates to a shared state or pulling the updates to a private state. We investigate the applicability of this push-pull dichotomy to various algorithms and its impact on complexity, performance, and the amount of used locks, atomics, and reads/writes. We consider 11 graph algorithms, 3 programming models, 2 graph abstractions, and various families of graphs. The conducted analysis illustrates surprising differences between push and pull variants of different algorithms in performance, speed of convergence, and code complexity; the insights are backed up by performance data from hardware counters. We use these findings to illustrate which variant is faster for each algorithm and to develop generic strategies that enable even higher speedups. Our insights can be used to accelerate graph processing engines or libraries on both massively-parallel shared-memory machines as well as distributed-memory systems.
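
As a rough Python illustration of the push/pull distinction the abstract refers to (our own sketch on a toy 3-vertex graph, not the authors' code), the two functions below compute the same PageRank-style update: the push variant writes contributions into shared next-state slots, while the pull variant only reads its in-neighbours and writes its own entry.

```python
# Hypothetical illustration of push vs. pull updates (not the paper's code).
out_edges = {0: [1, 2], 1: [2], 2: [0]}
in_edges = {0: [2], 1: [0], 2: [0, 1]}
ranks = {v: 1.0 / 3 for v in out_edges}
damping = 0.85

def push_step(ranks):
    # Push: each vertex scatters contributions into a shared next-state table.
    # In a parallel setting these writes would need atomics or locks.
    nxt = {v: (1 - damping) / len(ranks) for v in ranks}
    for v, succs in out_edges.items():
        share = damping * ranks[v] / len(succs)
        for u in succs:
            nxt[u] += share          # write to shared state
    return nxt

def pull_step(ranks):
    # Pull: each vertex reads its in-neighbours and updates only its own state,
    # so no synchronization on writes is required.
    nxt = {}
    for v, preds in in_edges.items():
        total = sum(ranks[u] / len(out_edges[u]) for u in preds)
        nxt[v] = (1 - damping) / len(ranks) + damping * total
    return nxt

for _ in range(20):
    ranks = pull_step(ranks)   # push_step(ranks) converges to the same values
print({v: round(r, 3) for v, r in ranks.items()})
```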

124 citations


Journal ArticleDOI
TL;DR: Signed graph-based multi-agent systems provide models to opinion dynamics and social networks, and may also hold significance in further developing such internet search algorithms as PageRank to counter spamming websites.
Abstract: This paper investigates sign-consensus problems of general linear multi-agent systems. The interaction between agents is modeled by a signed directed graph, where both cooperation and competition coexist within a group. The graph is allowed to be structurally unbalanced and its adjacency matrix is assumed to be eventually positive. Distributed control laws are proposed for several classes of graph topologies. Simulation examples are provided to illustrate the proposed control laws. Signed graph-based multi-agent systems provide models to opinion dynamics and social networks, and may also hold significance in further developing such internet search algorithms as PageRank to counter spamming websites.
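
For intuition only, here is a heavily simplified single-integrator sign-consensus simulation in the Altafini style on a structurally balanced signed graph; the adjacency matrix and initial opinions are invented, and the paper itself treats general linear agents and eventually positive, possibly structurally unbalanced adjacency matrices.

```python
# Heavily simplified single-integrator sign-consensus sketch (Altafini-style update);
# the adjacency matrix below is hypothetical and structurally balanced, whereas the
# paper treats general linear agents and eventually positive adjacency matrices.
import numpy as np

# Signed adjacency: positive = cooperation, negative = competition.
A = np.array([[ 0.0,  1.0, -1.0],
              [ 1.0,  0.0, -1.0],
              [-1.0, -1.0,  0.0]])
x = np.array([1.0, -2.0, 0.5])   # initial opinions (made up)
dt = 0.01

for _ in range(5000):
    dx = np.zeros_like(x)
    for i in range(len(x)):
        for j in range(len(x)):
            if A[i, j] != 0.0:
                # x_i' = -sum_j |a_ij| * (x_i - sign(a_ij) * x_j)
                dx[i] -= abs(A[i, j]) * (x[i] - np.sign(A[i, j]) * x[j])
    x = x + dt * dx

print(x)   # approx [-0.5, -0.5, 0.5]: equal magnitudes, signs split into two camps
```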

93 citations


Journal ArticleDOI
01 Apr 2017
TL;DR: gMark, as discussed by the authors, is a domain- and query language-independent graph instance and query workload generator that targets and controls the diversity of properties of both the generated instances and the generated workloads coupled to these instances.
Abstract: Massive graph data sets are pervasive in contemporary application domains. Hence, graph database systems are becoming increasingly important. In the experimental study of these systems, it is vital that the research community has shared solutions for the generation of database instances and query workloads having predictable and controllable properties. In this paper, we present the design and engineering principles of gMark, a domain- and query language-independent graph instance and query workload generator. A core contribution of gMark is its ability to target and control the diversity of properties of both the generated instances and the generated workloads coupled to these instances. Further novelties include support for regular path queries, a fundamental graph query paradigm, and schema-driven selectivity estimation of queries, a key feature in controlling workload chokepoints. We illustrate the flexibility and practical usability of gMark by showcasing the framework's capabilities in generating high quality graphs and workloads, and its ability to encode user-defined schemas across a variety of application domains.

71 citations


Journal ArticleDOI
01 Aug 2017
TL;DR: ExtraV is a framework for near-storage graph processing based on the novel concept of graph virtualization, which efficiently utilizes a cache-coherent hardware accelerator at the storage side to achieve performance and flexibility at the same time.
Abstract: In this paper, we propose ExtraV, a framework for near-storage graph processing. It is based on the novel concept of graph virtualization, which efficiently utilizes a cache-coherent hardware accelerator at the storage side to achieve performance and flexibility at the same time. ExtraV consists of four main components: 1) host processor, 2) main memory, 3) AFU (Accelerator Function Unit) and 4) storage. The AFU, a hardware accelerator, sits between the host processor and storage. Using a coherent interface that allows main memory accesses, it performs graph traversal functions that are common to various algorithms while the program running on the host processor (called the host program) manages the overall execution along with more application-specific tasks. Graph virtualization is a high-level programming model of graph processing that allows designers to focus on algorithm-specific functions. Realized by the accelerator, graph virtualization gives the host programs an illusion that the graph data reside on the main memory in a layout that fits with the memory access behavior of host programs even though the graph data are actually stored in a multi-level, compressed form in storage. We prototyped ExtraV on a Power8 machine with a CAPI-enabled FPGA. Our experiments on a real system prototype show significant speedups compared to state-of-the-art software-only implementations.

61 citations


Proceedings Article
12 Jul 2017
TL;DR: It is demonstrated that NUMA-awareness and its attendant pre-processing costs are beneficial only on large machines and for certain algorithms, calling into question the benefits of proposed algorithmic optimizations that rely on extensive preprocessing.
Abstract: Graph processing systems are used in a wide variety of fields, ranging from biology to social networks, and a large number of such systems have been described in the recent literature. We perform a systematic comparison of various techniques proposed to speed up in-memory multicore graph processing. In addition, we take an end-to-end view of execution time, including not only algorithm execution time, but also pre-processing time and the time to load the graph input data from storage. More specifically, we study various data structures to represent the graph in memory, various approaches to pre-processing and various ways to structure the graph computation. We also investigate approaches to improve cache locality, synchronization, and NUMA-awareness. In doing so, we take our inspiration from a number of graph processing systems, and implement the techniques they propose in a single system. We then selectively enable different techniques, allowing us to assess their benefits in isolation and independent of unrelated implementation considerations. Our main observation is that the cost of pre-processing in many circumstances dominates the cost of algorithm execution, calling into question the benefits of proposed algorithmic optimizations that rely on extensive preprocessing. Equally surprising, using radix sort turns out to be the most efficient way of pre-processing the graph input data into adjacency lists, when the graph input data is already in memory or is loaded from fast storage. Furthermore, we adapt a technique developed for out-of-core graph processing, and show that it significantly improves cache locality. Finally, we demonstrate that NUMA-awareness and its attendant pre-processing costs are beneficial only on large machines and for certain algorithms.
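
The remark about sorting-based pre-processing can be made concrete with a small sketch (hypothetical, not the authors' implementation) that turns an in-memory edge list into a CSR-style adjacency structure using a counting pass over source ids, which is the per-digit step of a radix sort.

```python
# Hypothetical sketch: build a CSR-style adjacency list from an edge list
# with one counting pass over source ids (the core step of a radix/counting sort).
def edges_to_csr(num_vertices, edges):
    # 1. Count the out-degree of every source vertex.
    degree = [0] * num_vertices
    for src, _ in edges:
        degree[src] += 1
    # 2. Prefix-sum the degrees to get each vertex's offset into the edge array.
    offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        offsets[v + 1] = offsets[v] + degree[v]
    # 3. Scatter destinations into their slots in a single stable pass.
    cursor = offsets[:-1].copy()
    neighbors = [0] * len(edges)
    for src, dst in edges:
        neighbors[cursor[src]] = dst
        cursor[src] += 1
    return offsets, neighbors

offsets, neighbors = edges_to_csr(4, [(0, 1), (2, 3), (0, 2), (1, 3), (2, 0)])
for v in range(4):
    print(v, neighbors[offsets[v]:offsets[v + 1]])
```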

58 citations


Proceedings ArticleDOI
09 May 2017
TL;DR: This work revisits SQL recursive queries, shows that the 4 proposed operations together with the others are guaranteed to have a fixpoint, following techniques studied for Datalog, and enhances the recursive WITH clause of SQL'99.
Abstract: To support analytics on massive graphs such as online social networks, RDF, Semantic Web, etc. many new graph algorithms are designed to query graphs for a specific problem, and many distributed graph processing systems are developed to support graph querying by programming. In this paper, we focus on RDBMSs, which have been well studied over decades to manage large datasets, and we revisit the issue of how an RDBMS can support graph processing at the SQL level. Our work is motivated by the fact that there are many relations stored in an RDBMS that are closely related to a graph in real applications and need to be used together to query the graph, and an RDBMS is a system that can query and manage data while data may be updated over time. To support graph processing, in this work, we propose 4 new relational algebra operations, MM-join, MV-join, anti-join, and union-by-update. Here, MM-join and MV-join are join operations between two matrices and between a matrix and a vector, respectively, followed by aggregation computing over groups, given that a matrix/vector can be represented by a relation. Both deal with the semiring by which many graph algorithms can be supported. The anti-join removes nodes/edges in a graph when they are unnecessary for the following computing. The union-by-update addresses value updates to compute PageRank, for example. The 4 new relational algebra operations can be defined by the 6 basic relational algebra operations with group-by & aggregation. We revisit SQL recursive queries and show that the 4 operations together with the others are guaranteed to have a fixpoint, following techniques studied for Datalog, and we enhance the recursive WITH clause of SQL'99. We conduct extensive performance studies to test 10 graph algorithms using 9 large real graphs in 3 major RDBMSs. We show that RDBMSs are capable of dealing with graph processing in reasonable time. The focus of this work is at the SQL level. There is high potential to improve the efficiency by main-memory RDBMSs, efficient join processing in parallel, and new storage management.
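
To make the MV-join idea concrete, the following Python sketch (our own illustration with made-up relations, not the paper's SQL) performs a join between a matrix relation and a vector relation followed by group-by aggregation, the pattern the authors use to express one PageRank-style step.

```python
# Illustrative MV-join: join a matrix relation M(src, dst, weight) with a vector
# relation V(node, value), then aggregate grouped on dst. The paper expresses the
# same pattern in SQL; relations and values here are hypothetical.
from collections import defaultdict

M = [  # (src, dst, weight): a column-normalized transition matrix as a relation
    (0, 1, 0.5), (0, 2, 0.5),
    (1, 2, 1.0),
    (2, 0, 1.0),
]
V = [(0, 1 / 3), (1, 1 / 3), (2, 1 / 3)]   # (node, rank)

def mv_join(M, V, damping=0.85):
    ranks = dict(V)
    grouped = defaultdict(float)
    # Join M with V on M.src = V.node, multiply, then SUM(...) GROUP BY dst.
    for src, dst, w in M:
        grouped[dst] += w * ranks[src]
    n = len(ranks)
    return [(node, (1 - damping) / n + damping * grouped[node]) for node in ranks]

print(mv_join(M, V))   # the vector relation after one PageRank-style step
```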

45 citations


Proceedings ArticleDOI
25 Jun 2017
TL;DR: This paper addresses the problem of breaking cycles while preserving the logical structure (hierarchy) of a directed graph as much as possible, and infers graph hierarchy using a range of features, including a Bayesian skill rating system and a social agony metric.
Abstract: Taxonomy graphs that capture hyponymy or meronymy relationships through directed edges are expected to be acyclic. However, in practice, they may have thousands of cycles, as they are often created in a crowd-sourced way. Since these cycles represent logical fallacies, they need to be removed for many web applications. In this paper, we address the problem of breaking cycles while preserving the logical structure (hierarchy) of a directed graph as much as possible. Existing approaches for this problem either need manual intervention or use heuristics that can critically alter the taxonomy structure. In contrast, our approach infers graph hierarchy using a range of features, including a Bayesian skill rating system and a social agony metric. We also devise several strategies to leverage the inferred hierarchy for removing a small subset of edges to make the graph acyclic. Extensive experiments demonstrate the effectiveness of our approach.
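
As a much-simplified sketch of the general idea, and not the authors' Bayesian skill-rating or social-agony method, one can assign each node a hierarchy score and drop the edges that point against it; the tiny taxonomy and scores below are invented for illustration.

```python
# Simplified sketch of hierarchy-based cycle breaking (illustrative ranking only;
# the paper infers the hierarchy with Bayesian skill ratings and a social agony metric).
edges = [("animal", "dog"), ("animal", "cat"), ("dog", "puppy"),
         ("puppy", "animal")]   # the last edge introduces a cycle

# Hypothetical hierarchy scores: smaller = higher in the taxonomy.
rank = {"animal": 0, "dog": 1, "cat": 1, "puppy": 2}

# Keep only edges that go down the hierarchy; edges pointing upward are removed.
kept = [(u, v) for u, v in edges if rank[u] < rank[v]]
removed = [e for e in edges if e not in kept]
print("kept:", kept)
print("removed:", removed)   # [('puppy', 'animal')]: the remaining graph is acyclic
```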

31 citations


Proceedings ArticleDOI
19 May 2017
TL;DR: This paper presents a graph database benchmarking architecture built on the existing LDBC Social Network Benchmark, and evaluates a selection of specialized graph databases, RDF stores, and RDBMSes adapted for graphs, finding that specialized graph databases do not provide definitively better performance.
Abstract: With the advent of online social networks, there is an increasing demand for storage and processing of graph-structured data. Social networking applications pose new challenges to data management systems due to demand for real-time querying and manipulation of the graph structure. Recently, several specialized systems for graph-structured data have been introduced. However, whether we should abandon mature RDBMS technology for graph databases remains an ongoing discussion. In this paper we present a graph database benchmarking architecture built on the existing LDBC Social Network Benchmark. Our proposed architecture stresses the systems with an interactive transactional workload to better simulate the real-time nature of social networking applications. Using this improved architecture, we evaluated a selection of specialized graph databases, RDF stores, and RDBMSes adapted for graphs. We do not find that specialized graph databases provide definitively better performance.

30 citations


Journal ArticleDOI
TL;DR: This paper focuses on the graph database Neo4j and its capabilities for expressing a database schema and integrity constraints (ICs), and extends these capabilities through new constructs in a Neo4j DDL, including a prototype implementation and experiments.

26 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed approach is faster and more accurate than existing algorithms in discovering deadlock states in most of the case studies with large state spaces.
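
Since this topic page concerns wait-for graphs, a generic sketch of wait-for-graph deadlock detection may help place this result; it is textbook DFS cycle detection over an assumed wait-for relation, not the paper's state-space exploration technique.

```python
# Generic wait-for-graph deadlock check (textbook cycle detection via DFS);
# the paper itself explores large state spaces, this only illustrates the concept.
def find_deadlock(wait_for):
    """wait_for maps each process to the processes it is waiting on.
    Returns a cycle (list of processes) if one exists, else None."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {p: WHITE for p in wait_for}
    stack = []

    def dfs(p):
        color[p] = GREY
        stack.append(p)
        for q in wait_for.get(p, ()):
            if color.get(q, WHITE) == GREY:            # back edge -> cycle found
                return stack[stack.index(q):] + [q]
            if color.get(q, WHITE) == WHITE:
                cycle = dfs(q)
                if cycle:
                    return cycle
        stack.pop()
        color[p] = BLACK
        return None

    for p in wait_for:
        if color[p] == WHITE:
            cycle = dfs(p)
            if cycle:
                return cycle
    return None

# P1 waits for P2, P2 waits for P3, P3 waits for P1: a deadlock.
print(find_deadlock({"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}))
```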

25 citations


Proceedings ArticleDOI
19 May 2017
TL;DR: An end-to-end graph analysis framework, called GraphGen, that sits atop an RDBMS, and supports graph querying/analytics through defining graphs as transformations over underlying relational datasets (as Graph-Views), and providing the ability to write arbitrary programs against the graphs.
Abstract: Graph querying and analytics are becoming an increasingly important component of the arsenal of tools for extracting different kinds of insights from data. Despite an immense amount of work on those topics, graphs are largely still handled in an ad hoc manner, in part because most data continues to reside in relational-like data management systems, and because graph analytics/querying typically forms a small portion of the overall analysis pipelines. In this paper we describe an end-to-end graph analysis framework, called GraphGen, that sits atop an RDBMS, and supports graph querying/analytics through: (a) defining graphs as transformations over underlying relational datasets (as Graph-Views) and (b) specifying queries or analytics on those graphs using either a high-level language or Java programs against a simple graph API. Although conceptually simple, GraphGen acts as an abstraction/independence layer that opens up many opportunities for adaptively optimizing graph analysis workflows, since the system can decide where to execute tasks on a per-task basis (in database or outside), how much of the graph to materialize in memory, and what types of in-memory representations to use (especially critical when the graphs are larger than the input datasets, as is often the case). At the same time, by providing the ability to write arbitrary programs against the graphs, GraphGen removes a major expressivity limitation of many existing graph analysis systems, which only support limited programming frameworks. We describe the GraphGen DSL, loosely based on Datalog, that includes both graph specification and in-line analysis capabilities. We then discuss many optimization challenges in building GraphGen, that we are currently working on addressing.
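
A tiny sketch of the "graph as a view over relations" idea, using only Python's built-in sqlite3 rather than GraphGen's Datalog-like DSL: a hypothetical co-authorship graph is extracted from an author-publication table with a self-join.

```python
# Hypothetical sketch of a graph defined as a transformation over relational data;
# GraphGen expresses this with a Datalog-like DSL, here we use plain SQL + Python.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author_pub (author TEXT, pub_id INTEGER);
INSERT INTO author_pub VALUES
  ('alice', 1), ('bob', 1), ('carol', 1),
  ('alice', 2), ('dave', 2);
""")

# Graph-view: nodes are authors, an edge links authors sharing a publication.
edges = conn.execute("""
    SELECT DISTINCT a.author, b.author
    FROM author_pub a JOIN author_pub b
      ON a.pub_id = b.pub_id AND a.author < b.author
""").fetchall()

graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

print(graph)   # e.g. alice is connected to bob, carol and dave
```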

Proceedings ArticleDOI
09 May 2017
TL;DR: This tutorial reviews and summarizes the research thus far into HCI and graph querying in the database community, giving researchers a snapshot of the current state of the art in this topic, and future research directions.
Abstract: Querying graph databases has emerged as an important research problem for real-world applications that center on large graph data. Given the syntactic complexity of graph query languages (e.g., SPARQL, Cypher), visual graph query interfaces make it easy for non-expert users to query such graph data repositories. In this tutorial, we survey recent developments in the emerging area of visual graph querying paradigm that bridges traditional graph querying with human computer interaction (HCI). We discuss manual and data-driven visual graph query interfaces, various strategies and guidance for constructing graph queries visually, interleaving processing of graph queries and visual actions, and visual exploration of graph query results. In addition, the tutorial suggests open problems and new research directions. In summary, in this tutorial we review and summarize the research thus far into HCI and graph querying in the database community, giving researchers a snapshot of the current state of the art in this topic, and future research directions.

Proceedings Article
01 Jan 2017
TL;DR: It is shown that the relational database system outperforms Neo4j for analytical queries and that Neo4J is faster for queries that do not filter on specific edge types.
Abstract: Graph databases with a custom non-relational backend promote themselves to outperform relational databases in answering queries on large graphs. Recent empirical studies show that this claim is not always true. However, these studies focus only on pattern matching queries and neglect analytical queries used in practice such as shortest path, diameter, degree centrality or closeness centrality. In addition, there is no distinction between different types of pattern matching queries. In this paper, we introduce a set of analytical and pattern matching queries, and evaluate them in Neo4j and a market-leading commercial relational database system. We show that the relational database system outperforms Neo4j for our analytical queries and that Neo4j is faster for queries that do not filter on specific edge types.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed improved policy overcomes the drawbacks of the conventional maximal number of forbidding First Bad Marking (FBM) technique and can be used in all kinds of nets.
Abstract: Flexible manufacturing systems exhibit a high degree of resource sharing. Since the parts advancing through the system compete for a finite number of resources, a deadlock may occur. Accordingly, m

Proceedings ArticleDOI
01 Jun 2017
TL;DR: This work proposes to bridge bidirectional value driven design between economic planning and technology implementation on the basis of the data graph, information graph and knowledge graph to improve system reliability and robustness by managing data and information reuse, redundancy as well as structure.
Abstract: Value-Driven Design enables rational decisions to be made in terms of the optimum business and technical solution at every level of engineering design by employing economics in decision making. In order to maximize the business profitability, we propose to bridge bidirectional value driven design between economic planning and technology implementation on the basis of the data graph, information graph and knowledge graph. We use data graph, information graph and knowledge graph to analyze problems that have negative impact on activities of software development including requirement analysis, summary design and detail design. We propose to improve system reliability and robustness by managing data and information reuse, redundancy as well as structure.

Proceedings ArticleDOI
28 Aug 2017
TL;DR: This paper shows how the proposed multidimensional (MD) data model for graph analysis was implemented over the widely used Neo4J graph database, discusses implementation issues, and presents a detailed case study to show how OLAP operations can be used on graphs.
Abstract: In current Big Data scenarios, traditional data warehousing and Online Analytical Processing (OLAP) operations on cubes are clearly not sufficient to address the current data analysis requirements. Nevertheless, OLAP operations and models can expand the possibilities of graph analysis beyond the traditional graph-based computation. In spite of this, there is not much work on the problem of taking OLAP analysis to the graph data model. In previous work we proposed a multidimensional (MD) data model for graph analysis, that considers not only the basic graph data, but background information in the form of dimension hierarchies as well. The graphs in our model are node- and edge-labelled directed multi-hypergraphs, called graphoids, defined at several different levels of granularity. In this paper we show how we implemented this proposal over the widely used Neo4J graph database, discuss implementation issues, and present a detailed case study to show how OLAP operations can be used on graphs.

Book ChapterDOI
24 Sep 2017
TL;DR: In this article, the authors observe that graph database systems are increasingly adapted for storing and processing heterogeneous network-like datasets, but that, due to the novelty of such systems, no standard data model or query language has yet emerged, subjecting users to the possibility of vendor lock-in.
Abstract: Graph database systems are increasingly adapted for storing and processing heterogeneous network-like datasets. However, due to the novelty of such systems, no standard data model or query language has yet emerged. Consequently, migrating datasets or applications even between related technologies often requires a large amount of manual work or ad-hoc solutions, thus subjecting the users to the possibility of vendor lock-in. To avoid this threat, vendors are working on supporting existing standard languages (e.g. SQL) or standardising languages.

Journal ArticleDOI
TL;DR: A new method for storing, modeling, and analyzing power grid data using the open source graph database Neo4j and the efficiency and effectiveness of topology modeling and analysis using graph database for a power grid network are introduced.
Abstract: We introduce a new method for storing, modeling, and analyzing power grid data. First, we present an architecture for building the network model for a power grid using the open source graph database Neo4j. Second, we design single- and multi-threading systems for initial energization analysis of the power grid network. We design the shortest path search function and conditional search function based on Neo4j. Finally, we compare the functionality and efficiency of our graph database with a traditional relational database in system initial energization analysis and the shortest path function problems on small to large data sets. The results demonstrate the efficiency and effectiveness of topology modeling and analysis using graph database for a power grid network.
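
The shortest-path search mentioned here is built on Neo4j in the paper; purely as an illustration of the underlying computation, the sketch below runs Dijkstra's algorithm over a small made-up grid topology kept in plain Python structures.

```python
# Illustrative Dijkstra shortest path over a toy power-grid topology (hypothetical
# buses and line weights); the paper implements the equivalent search on top of Neo4j.
import heapq

lines = {  # bus -> [(neighbour bus, line weight)]
    "bus1": [("bus2", 1.0), ("bus3", 4.0)],
    "bus2": [("bus1", 1.0), ("bus3", 1.5), ("bus4", 3.0)],
    "bus3": [("bus1", 4.0), ("bus2", 1.5), ("bus4", 1.0)],
    "bus4": [("bus2", 3.0), ("bus3", 1.0)],
}

def shortest_path(graph, source, target):
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in graph[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    path, node = [], target
    while node != source:
        path.append(node)
        node = prev[node]
    return [source] + path[::-1], dist[target]

print(shortest_path(lines, "bus1", "bus4"))   # (['bus1', 'bus2', 'bus3', 'bus4'], 3.5)
```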

Proceedings ArticleDOI
01 Sep 2017
TL;DR: In this paper, the authors present an approach and associated software for analyzing the performance and scalability of parallel, open-source graph libraries, such as GraphMat, Graph500, Graph Algorithm Platform Benchmark Suite, GraphBIG, and PowerGraph.
Abstract: The rapidly growing number of large network analysis problems has led to the emergence of many parallel and distributed graph processing systems—one survey in 2014 identified over 80. Determining the best approach for a given problem is infeasible for most developers. We present an approach and associated software for analyzing the performance and scalability of parallel, open-source graph libraries. We demonstrate our approach on five graph processing packages: GraphMat, Graph500, Graph Algorithm Platform Benchmark Suite, GraphBIG, and PowerGraph, using synthetic and real-world datasets. We examine previously overlooked aspects of parallel graph processing performance, such as phases of execution and energy usage, for three algorithms: breadth first search, single source shortest paths, and PageRank.

Posted Content
TL;DR: GraphMP as mentioned in this paper proposes a vertex-centric sliding window (VSW) computation model to avoid reading and writing vertices on disk and a selective scheduling method to skip loading and processing unnecessary edge shards on disk.
Abstract: Recent studies showed that single-machine graph processing systems can be as highly competitive as cluster-based approaches on large-scale problems. While several out-of-core graph processing systems and computation models have been proposed, the high disk I/O overhead could significantly reduce performance in many practical cases. In this paper, we propose GraphMP to tackle big graph analytics on a single machine. GraphMP achieves low disk I/O overhead with three techniques. First, we design a vertex-centric sliding window (VSW) computation model to avoid reading and writing vertices on disk. Second, we propose a selective scheduling method to skip loading and processing unnecessary edge shards on disk. Third, we use a compressed edge cache mechanism to fully utilize the available memory of a machine to reduce the amount of disk accesses for edges. Extensive evaluations have shown that GraphMP could outperform state-of-the-art systems such as GraphChi, X-Stream and GridGraph by 31.6x, 54.5x and 23.1x respectively, when running popular graph applications on a billion-vertex graph.

Proceedings ArticleDOI
01 Sep 2017
TL;DR: DH-Falcon is presented, a graph DSL (domain-specific language) which can be used to implement parallel algorithms for large-scale graphs, targeting Distributed Heterogeneous (CPU and GPU) clusters, and gains a speedup of up to 13×.
Abstract: Graph models of social information systems typically contain trillions of edges. Such big graphs cannot be processed on a single machine. The graph object must be partitioned and distributed among machines and processed in parallel on a computer cluster. Programming such systems is very challenging. In this work, we present DH-Falcon, a graph DSL (domain-specific language) which can be used to implement parallel algorithms for large-scale graphs, targeting Distributed Heterogeneous (CPU and GPU) clusters. The DH-Falcon compiler is built on top of the Falcon compiler, which targets single node devices with CPU and multiple GPUs. An important facility provided by DH-Falcon is that it supports mutation of graph objects, which allows programmers to write dynamic graph algorithms. Experimental evaluation shows that DH-Falcon matches or outperforms state-of-the-art frameworks and gains a speedup of up to 13×.

Patent
06 Jan 2017
TL;DR: In this article, an un-optimized computational graph is analyzed using pattern matching to determine fusible operations that can be fused together into a single fusion operation, and the fusion node is translated as a call that performs the fused operations.
Abstract: Method comprising obtaining 202 an un-optimised computational graph comprising a plurality of nodes representing operations and directed edges representing data dependencies; analysing 204 the un-optimised graph using pattern matching to determine fusible operations that can be fused together into a single fusion operation; transforming 206 the un-optimised graph into an optimised computational graph by replacing the nodes representing the fusible operations in the un-optimised graph with a fusion node representing the single fusion operation; and providing to a compiler the fusion node that the compiler can translate as a call that performs the fused operations to produce 208 efficient code. It may also provide the efficient code to computing devices for execution. Execution may include executing the operations of the computational graph including the single fusion call that performs all fused operations. Analysing using pattern matching may involve comparing portions of the un-optimised graph with patterns of operations that each correspond to a single fusion operation; determining that a pattern matches a portion of the un-optimised graph; and determining that the matching portion of the un-optimised graph can be replaced in the computational graph with the single fusion operation corresponding to the matching pattern.
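
A toy sketch of the kind of pattern-matching fusion the claim describes (the node names and the fused multiply-add pattern are ours, not taken from the patent): a mul node that feeds only one add node is collapsed into a single fused node.

```python
# Toy illustration of fusing operations in a computational graph by pattern matching;
# the node names and the 'fma' fusion pattern are hypothetical, not from the patent.
graph = {                       # node id -> (op, list of input node ids)
    "a": ("input", []),
    "b": ("input", []),
    "c": ("input", []),
    "t": ("mul", ["a", "b"]),
    "out": ("add", ["t", "c"]),
}

def fuse_mul_add(graph):
    fused = dict(graph)
    for node, (op, inputs) in graph.items():
        if op != "add":
            continue
        for inp in inputs:
            in_op, in_inputs = graph[inp]
            uses = sum(inp in ins for _, ins in graph.values())
            # Pattern: add(mul(x, y), z) where the mul feeds nothing else -> fma(x, y, z)
            if in_op == "mul" and uses == 1:
                other = [i for i in inputs if i != inp]
                fused[node] = ("fma", in_inputs + other)
                del fused[inp]
    return fused

print(fuse_mul_add(graph))
# {'a': ..., 'out': ('fma', ['a', 'b', 'c'])} -- the mul node 't' has been fused away
```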

Journal ArticleDOI
TL;DR: This paper presents a large-graph visualization system called ModuleGraph, a scalable representation of graph structures by treating a graph as a set of modules that can efficiently support large-scale social and spatial network visualization.
Abstract: The efficient visualization of dynamic network structures has become a dominant problem in many big data applications, such as large network analytics, traffic management, resource allocation graphs, logistics, social networks, and large document repositories. In this paper, we present a large-graph visualization system called ModuleGraph. ModuleGraph is a scalable representation of graph structures by treating a graph as a set of modules. The main objectives are: (1) to detect graph patterns in the visualization of large-graph data, and (2) to emphasize the interconnecting structures to detect potential interactions between local modules. Our first contribution is a hybrid modularity measure. This measure partitions the cohesion of the graph at various levels of details. We aggregate clusters of nodes and edges into several modules to reduce the overlap between graph components on a 2D display. Our second contribution is a k-clustering method that can flexibly detect the local patterns or substructures in modules. Patterns of modules are preserved by the ModuleGraph system to avoid information loss, while sub-graphs are clustered as a single node. Our experiments show that this method can efficiently support large-scale social and spatial network visualization.

Book ChapterDOI
12 Jun 2017
TL;DR: A multi-core scheduling approach for a model presenting Mixed-criticality tasks and their dependencies as a Directed Acyclic Graph (DAG) is proposed and an evaluation framework is introduced, released as an open source software.
Abstract: Deploying safety-critical systems into constrained embedded platforms is a challenge for developers, who must arbitrate between two conflicting objectives: software has to be safe and resources need to be used efficiently. Mixed-criticality (MC) has been proposed to meet a trade-off between these two aspects. Nonetheless, most task models considered in the MC scheduling literature do not take into account precedence constraints among tasks. In this paper, we propose a multi-core scheduling approach for a model presenting MC tasks and their dependencies as a Directed Acyclic Graph (DAG). We also introduce an evaluation framework for this model, released as open source software. Evaluation of our scheduling algorithm provides evidence of the difficulty of finding correct schedules for DAGs of MC tasks. Moreover, the experimental results provided in this paper show that our scheduling algorithm outperforms existing algorithms for scheduling DAGs of MC tasks.
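
For readers unfamiliar with the task model, the sketch below performs a plain list-scheduling pass of a precedence-constrained task DAG onto two cores; the task set is invented and the sketch ignores the mixed-criticality aspects that the paper's algorithm handles.

```python
# Minimal list scheduling of a task DAG on 2 cores (illustration of the task model
# only; the paper's algorithm additionally handles mixed-criticality modes).
tasks = {"A": 2, "B": 3, "C": 2, "D": 1}                      # task -> execution time
deps = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}     # task -> predecessors

cores = [0, 0]        # time at which each core becomes free
finish = {}           # task -> finish time
ready = [t for t in tasks if not deps[t]]
scheduled = []

while ready:
    task = ready.pop(0)
    earliest = max((finish[p] for p in deps[task]), default=0)
    core = min(range(len(cores)), key=lambda c: cores[c])     # earliest-free core
    start = max(cores[core], earliest)
    finish[task] = start + tasks[task]
    cores[core] = finish[task]
    scheduled.append((task, core, start, finish[task]))
    ready += [t for t in tasks
              if t not in finish and t not in ready
              and all(p in finish for p in deps[t])]

print(scheduled)   # e.g. A first, then B and C in parallel on both cores, then D
```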

Journal ArticleDOI
23 May 2017
TL;DR: This paper presents optimization techniques used in relational database systems and applies them to graph queries; it evaluates various query plans on multiple datasets and discusses the effect of different optimization techniques.
Abstract: The last decade brought considerable improvements in distributed storage and query technologies, known as NoSQL systems. These systems provide quick evaluation of simple retrieval operations and are able to answer certain complex queries in a scalable way, albeit not instantly. Providing scalability and quick response times at the same time for querying large data sets is still a challenging task. Evaluating complex graph queries is particularly difficult, as it requires lots of join, antijoin and filtering operations. This paper presents optimization techniques used in relational database systems and applies them to graph queries. We evaluate various query plans on multiple datasets and discuss the effect of different optimization techniques.

Proceedings ArticleDOI
09 May 2017
TL;DR: This tutorial will provide a generalized definition of graph exploration in which the user interacts directly with the system either providing feedback or a partial query, and discuss common, diverse, and missing properties ofgraph exploration techniques based on this definition, the authors' taxonomy, and multiple applications for graph exploration.
Abstract: The increasing interest in social networks, knowledge graphs, protein-interaction, and many other types of networks has raised the question how users can explore such large and complex graph structures easily. Current tools focus on graph management, graph mining, or graph visualization but lack user-driven methods for graph exploration. In many cases graph methods try to scale to the size and complexity of a real network. However, methods miss user requirements such as exploratory graph query processing, intuitive graph explanation, and interactivity in graph exploration. While there is consensus in database and data mining communities on the definition of data exploration practices for relational and semi-structured data, graph exploration practices are still indeterminate. In this tutorial, we will discuss a set of techniques, which have been developed in the last few years for independent purposes, within a unified graph exploration taxonomy. The tutorial will provide a generalized definition of graph exploration in which the user interacts directly with the system either providing feedback or a partial query. We will discuss common, diverse, and missing properties of graph exploration techniques based on this definition, our taxonomy, and multiple applications for graph exploration. Concluding this discussion we will highlight interesting and relevant challenges for data scientists in graph exploration.

Journal ArticleDOI
TL;DR: This research illustrates the potential of graph structures for diversified modeling along with their convenience for various domains, and provides guidelines for solving data modeling challenges for structured, semi-structured and unstructured data.
Abstract: Graphs are considered the next frontier in the era of Big Data due to their flexibility and self-explaining property. The prime objective of this research is to present graph databases as an alternative to traditional relational databases. This research illustrates the potential of graph structures for diversified modeling along with their convenience for various domains. An extensive literature review demonstrates the use of diversified graph structures as a means of data storage and analysis, as they can cope with any kind of complex structure, ranging from multi-linked web data, complex chemical structures, gene data, network structures, social networks and e-commerce to text data. A formal conclusion of this review reveals the use of various graph models according to the state of affairs of various domains, as well as the data modeling challenges and complexity of Big Data. This investigation recommends the use of an appropriate graph structure and provides guidelines for solving data modeling challenges for structured, semi-structured and unstructured data. Diversified graph structures, along with their characteristics, have been suggested for real-world problems in various domains. This research could lend a helping hand to anyone who wants to implement a graph data model for their data management challenges and computational problems.

Proceedings ArticleDOI
08 Aug 2017
TL;DR: This work reports on efforts to develop verification methods for OSEK-conformant compilers, specifically of a code generator that weaves system calls and application code using a static configuration file, producing a stand-alone application that incorporates the relevant parts of the kernel.
Abstract: The OSEK industrial standard governs the design of embedded real-time operating systems in the automotive domain. We report on efforts to develop verification methods for OSEK-conformant compilers, specifically of a code generator that weaves system calls and application code using a static configuration file, producing a stand-alone application that incorporates the relevant parts of the kernel. Our methodology involves two verification steps: On the one hand, we extract an OS-application interaction graph during the compilation phase and verify that it conforms to the standard, in particular regarding prioritized scheduling and interrupt handling. To this end, we generate from the configuration file a temporal specification of standard-conformant behaviour and model check the arising formulas on a labelled transition system extracted from the interaction graph. On the other hand, we verify that the actual generated code conforms to the interaction graph; this is done by graph isomorphism checking of the interaction graph against a dynamically-explored state-transition graph of the generated system.

Proceedings ArticleDOI
01 Aug 2017
TL;DR: GeaBase is a new distributed graph database that provides the capability to store and analyze graph-structured data in real-time at massive scale, including a novel update architecture, called Update Center (UC), and a new language that is suitable for both graph traversal and analytics.
Abstract: Graph analytics has been gaining traction rapidly in the past few years. It has a wide array of application areas in the industry, ranging from e-commerce, social networks and recommendation systems to fraud detection and virtually any problem that requires insights into data connections, not just data itself. In this paper, we present GeaBase, a new distributed graph database that provides the capability to store and analyze graph-structured data in real-time at massive scale. We describe the details of the system and the implementation, including a novel update architecture, called Update Center (UC), and a new language that is suitable for both graph traversal and analytics. We also compare the performance of GeaBase to a widely used open-source graph database, Titan. Experiments show that GeaBase is up to 182x faster than Titan in our testing scenarios. GeaBase also achieves 22x higher throughput on social network workloads in the comparison.

Journal ArticleDOI
TL;DR: This paper provides the definitions of the summaries together with methods to extract them automatically, both from NoSQL graph databases alone and with the help of in-memory architectures, and demonstrates the benefit of the proposition with experimental results.
Abstract: NoSQL graph databases have been introduced in recent years for dealing with large collections of graph-based data. Scientific data and social networks are among the best examples of the dramatic increase of the use of such structures. NoSQL repositories allow the management of large amounts of data in order to store and query them. Such data are not structured with a predefined schema as relational databases would be. They are rather composed of nodes and relationships of a certain type. For instance, a node can represent a Person and a relationship Friendship. Retrieving the structure of the graph database is thus of great help to users, for example when they must know how to query the data or identify relevant data sources for recommender systems. For this reason, this paper introduces methods to retrieve structural summaries. Such structural summaries are extracted at different levels of information from the NoSQL graph database. The expression of the mining queries is facilitated by the use of two frameworks: Fuzzy4S, which allows defining fuzzy operators and operations with Scala, and Cypherf, which allows the use of fuzzy operators and operations in declarative queries over NoSQL graph databases. We show that extracting such summaries can be impossible with the NoSQL query engines because of the data volume and the complexity of the task of automatic knowledge extraction. A novel method based on in-memory architectures is thus introduced. This paper provides the definitions of the summaries with the methods to automatically extract them, both from NoSQL graph databases alone and with the help of in-memory architectures. The benefit of our proposition is demonstrated by experimental results.