TL;DR: A novel multicore parallel algorithm for computing the frequency of small subgraphs in a large network, built on a state-of-the-art data structure, the g-trie, which allows for a very efficient sequential search; it paves the way for using such counting algorithms on larger subgraphs and networks without requiring access to a cluster.
Abstract: Computing the frequency of small subgraphs on a large network is a computationally hard task. It is, however, an important graph mining primitive, with several applications, and here we present a novel multicore parallel algorithm for this task. At the core of our methodology lies a state-of-the-art data structure, the g-trie, which represents a collection of subgraphs and allows for a very efficient sequential search. Our implementation was done using Pthreads and can run on any multicore personal computer. We employ a diagonal work-sharing strategy to dynamically and effectively divide work among threads during execution. We assess the performance of our Pthreads implementation on a set of representative networks from various domains and with diverse topological features. For most networks, we obtain a speedup of over 50 for 64 cores and an almost linear speedup up to 32 cores, showcasing the flexibility and scalability of our algorithm. This paves the way for using such counting algorithms on larger subgraphs and networks without requiring access to a cluster.
A typical motif discovery algorithm will need to count all subgraphs of a certain size both in the original network and in an ensemble of similar randomized networks [5].
Multicore architectures are, however, much more common and readily available to a typical practitioner, with multicores being pervasive even on personal computers.
The authors' main contribution in this paper is precisely a novel parallel algorithm for subgraph counting geared towards multicores.
Section II formalizes the problem being tackled and discusses related work.
A. Problem Definition
Two occurrences are considered different if they have at least one node or edge that they do not share.
Figure 1 gives an example of a subgraph frequency computation, detailing the subgraph occurrences found (these are given as sets of nodes).
Note also how the authors distinguish occurrences: other possible frequency concepts do exist [10], but here they resort to the standard definition.
B. Related Work
Sequential subgraph counting algorithms can be divided into three different conceptual approaches.
By contrast, subgraph-centric methods, such as the one by Grochow and Kellis [14], only search for one subgraph type at a time, individually computing their frequency.
These algorithms provide exact results, and here the authors also concentrate on exact frequency computation, although it should be noted that sampling alternatives exist for producing approximate results.
A different parallel approach is the one by Wang et al. [17], which employs a static pre-division of work and limits the analysis to a single network and a fixed number of cores (32).
A. The G-Trie Data Structure
Instead of storing strings and identifying common prefixes, it stores subgraphs and identifies common subtopologies.
Like a classical string trie, it is a multiway tree, and each tree node contains information about a single subgraph vertex and its connections to the vertices stored in ancestor tree nodes.
The same concept can be easily applied to directed subgraphs by also storing the direction of each connection.
This capability is the main strength of a g-trie, not only because the authors compress the information (avoiding redundant storage), but also because, when they are matching a specific node in the g-trie, they are, at the same time, matching all possible descendant subgraphs stored in the g-trie.
Given the space constraints, the authors refer the reader to [9] for a detailed explanation of how a g-trie can be created.
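For concreteness, here is a minimal C sketch of what a g-trie node might store, based only on the description above; the field names and layout are illustrative assumptions, not the authors' actual implementation.

```c
#include <stdbool.h>

/* Hypothetical layout of a g-trie node: each node adds one vertex to the
 * subgraphs spelled out by its ancestors and stores only that vertex's
 * connections to the previously stored vertices (the shared subtopology).
 * For directed subgraphs, conn[] would store a direction instead of a flag. */
typedef struct gtrie_node {
    int depth;                   /* vertices matched on the path to this node */
    bool *conn;                  /* conn[i]: is the new vertex adjacent to ancestor i? */
    bool is_leaf;                /* does this node complete a subgraph of interest? */
    unsigned long frequency;     /* occurrences counted during the search */
    int num_children;
    struct gtrie_node **children;
} gtrie_node;
```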
B. Subgraph Counting with a G-Trie
The algorithm depicted in Figure 3 details how the authors can use g-tries for counting subgraphs sequentially.
The authors use the information stored in the g-trie to heavily constrain the search.
Essentially, from the current partial match, the authors pick the already-matched vertex that is connected, according to the current g-trie node, to the vertex being added and that has the smallest number of neighbors in the network; those neighbors are the potential candidates for the new position (lines 14 and 15).
For the sake of illustration, the authors will now exemplify how one occurrence is found.
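To complement that walkthrough, the following C-style sketch captures the recursion just described (cf. Figure 3), using the gtrie_node sketch from earlier; the graph accessors and the candidate filter are hypothetical helpers, and symmetry-breaking conditions are omitted.

```c
/* Illustrative graph accessors, assumed to be provided elsewhere: */
int  graph_degree(int v);
int  graph_neighbor(int v, int i);
int  min_degree_connected_match(const gtrie_node *T, const int *Vused, int used);
bool matches_connections(const gtrie_node *T, const int *Vused, int used, int v);

/* Sketch of the sequential g-trie matching recursion. */
void count(gtrie_node *T, int Vused[], int used) {
    /* Among the already-matched vertices that must connect to the new
     * position, pick the one with the fewest neighbors: its adjacency
     * list is the smallest candidate pool (cf. lines 14-15). */
    int m = min_degree_connected_match(T, Vused, used);
    for (int i = 0; i < graph_degree(m); i++) {
        int v = graph_neighbor(m, i);
        if (!matches_connections(T, Vused, used, v))
            continue;                      /* v fails a required (non-)edge */
        Vused[used] = v;
        if (T->is_leaf)
            T->frequency++;                /* one more occurrence found */
        for (int c = 0; c < T->num_children; c++)
            count(T->children[c], Vused, used + 1);
    }
}
```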
IV. PARALLEL G-TRIE ALGORITHM
One of the most important aspects of their sequential algorithm is that it gives rise to completely independent search-tree branches.
In fact, each call to count(T, Vused) produces a distinct branch, and knowing the g-trie node T and the already-matched vertices Vused is enough to continue the search from that point.
The problem with a static strategy that simply pre-divides the vertices among threads is that the generated search tree is highly irregular and unbalanced.
To achieve a scalable approach, an efficient dynamic sharing mechanism, that redistributes work during execution time, is required.
In their parallel approach the authors keep this crucial feature of the algorithm and do not artificially introduce explicit queues during the normal execution of the algorithm.
A. Overall View
The authors allocate one thread per core, with each thread being initially assigned an equal number of vertices.
When a thread P finishes its allotted computation, it requests new work from another active thread Q, which responds by first stopping its computation.
Both threads then resume their execution, starting at the bottom (meaning the lowest levels of the g-trie) of their respective work trees.
The execution starts at the bottom so that only one Vused is necessary, taking advantage of the common subtopology of ancestor and descendant nodes in the same path.
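One plausible way to realize such a "diagonal" split is sketched below in C: the donor walks its current recursion stack and hands the requester every other unexplored candidate at each level, so both threads keep work at all depths. The work_tree layout, the MAX_DEPTH bound, and preallocated requester buffers are all assumptions made for illustration.

```c
#define MAX_DEPTH 16   /* assumed bound on subgraph size */

/* Snapshot of a thread's pending work: for each level of the current
 * search path, the candidate vertices not yet expanded. */
typedef struct {
    int depth;
    int *unexplored[MAX_DEPTH];   /* per-level candidate buffers (preallocated) */
    int n_unexplored[MAX_DEPTH];
} work_tree;

/* Donor splits its work diagonally with the requester: at every level,
 * alternate candidates go to the requester. */
void share_work(work_tree *donor, work_tree *req) {
    req->depth = donor->depth;
    for (int lvl = 0; lvl < donor->depth; lvl++) {
        int keep = 0;
        req->n_unexplored[lvl] = 0;
        for (int i = 0; i < donor->n_unexplored[lvl]; i++) {
            if (i % 2 == 0)   /* donor keeps even positions */
                donor->unexplored[lvl][keep++] = donor->unexplored[lvl][i];
            else              /* requester takes odd positions */
                req->unexplored[lvl][req->n_unexplored[lvl]++] =
                    donor->unexplored[lvl][i];
        }
        donor->n_unexplored[lvl] = keep;
    }
}
```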
The authors will now describe in more detail the various components of their algorithm.
B. Parallel Subgraph Frequency Counting
Figure 4 depicts their parallel counting algorithm.
At each step, the thread processes the vertex numthreads positions after the previous one (line 13).
The authors do this in a round-robin fashion because it generally provides a more equitable initial division than one using contiguous intervals. (The authors' implementation, along with test data, can be consulted at http://www.dcc.fc.up.pt/gtries/.)
This intuition was verified empirically: threads asked for work sooner when contiguous intervals were used.
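A minimal sketch of this round-robin seeding follows; num_vertices, gtrie_root, and the top-level call explore_vertex are assumed stand-ins for the real implementation.

```c
extern int num_vertices;                       /* assumed globals, for illustration */
extern gtrie_node *gtrie_root;
void explore_vertex(gtrie_node *root, int v);  /* hypothetical top-level search:
                                                  all subgraphs whose first matched
                                                  vertex is v */

/* Round-robin initial division: thread `threadid` of `nthreads` takes
 * vertices threadid, threadid + nthreads, threadid + 2*nthreads, ...
 * instead of a contiguous block of the vertex range. */
void initial_division(int threadid, int nthreads) {
    for (int v = threadid; v < num_vertices; v += nthreads)
        explore_vertex(gtrie_root, v);
}
```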
Initially, the authors kept a shared array Fr[1..numthreads] in each g-trie node, with each thread updating the position given by its own threadid.
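A sketch of that initial scheme, assuming the thread count is fixed at startup: since thread t writes only slot Fr[t], increments need no synchronization, and a final pass sums the slots.

```c
#define NUM_THREADS 64   /* assumed fixed at startup */

/* Per-node frequency slots: thread t touches only Fr[t], so increments
 * are race-free without any locking. */
typedef struct {
    unsigned long Fr[NUM_THREADS];
} node_counts;

static inline void add_occurrence(node_counts *c, int threadid) {
    c->Fr[threadid]++;               /* exclusive slot: no lock needed */
}

unsigned long total_frequency(const node_counts *c) {
    unsigned long sum = 0;           /* reduce once the search finishes */
    for (int t = 0; t < NUM_THREADS; t++)
        sum += c->Fr[t];
    return sum;
}
```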
E. Work Resuming
After the threads have shared work, each resumes its portion and proceeds with the computation.
If the thread receives a work request, work sharing is performed (line 7).
After work sharing is performed (lines 8 and 9), the thread continues its computation with the new work tree (line 10) and the current execution is discarded (line 11).
The thread first checks if it has arrived at a desired subgraph (line 12) and increases its frequency in that case (line 13).
Otherwise, the thread calls parallelCount with the new vertex added to Vused for each child of the g-trie node (lines 15 and 16).
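Putting those steps together, here is a hedged sketch of the loop body; the helpers echo the sketches above and are assumptions (as is a node_counts field named counts on the g-trie node), with candidate generation elided.

```c
/* Assumed helpers echoing the earlier sketches: */
bool pending_request(int threadid);   /* has another thread asked for work? */
void share_work(work_tree *donor, work_tree *req);
void resume_work(work_tree *w);       /* restart the search from a work tree */
extern int my_threadid;
extern work_tree my_work, requester_work;

/* Sketch of parallelCount following the steps cited from Figure 4. */
void parallelCount(gtrie_node *T, int Vused[], int used) {
    if (pending_request(my_threadid)) {          /* line 7: someone is idle */
        share_work(&my_work, &requester_work);   /* lines 8-9: diagonal split */
        resume_work(&my_work);                   /* line 10: new work tree */
        return;                                  /* line 11: drop this execution */
    }
    if (T->is_leaf)                              /* line 12: desired subgraph? */
        add_occurrence(&T->counts, my_threadid); /* line 13: bump own slot */
    /* ...candidate generation as in the sequential sketch... */
    for (int c = 0; c < T->num_children; c++)    /* lines 15-16: extend match */
        parallelCount(T->children[c], Vused, used + 1);
}
```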
V. RESULTS
The authors' experimental results were gathered on a 64-core machine.
Table II shows, for each network, the size of the subgraphs being counted and the resulting number of all possible subgraphs of that type and size.
The sequential time and the obtained speedups for 8, 16, 32 and 64 cores are shown in Tables III and IV.
Nevertheless, pbzip showed performance similar to the authors' algorithm, with near-linear speedup up to 32 cores and a speedup of around 50 for 64 cores, further supporting the idea that, on a different architecture, their algorithm could still achieve near-linear speedup beyond 32 cores.
The authors can also observe that as the network size increases, the performance slightly degrades.
VI. CONCLUSION
In this paper the authors presented a scalable algorithm to count subgraph frequencies for multicore architectures.
The sequential version already performed significantly better than competing algorithms, making it a solid base for improvement.
To the best of their knowledge, their parallel algorithm is the fastest available method for shared memory environments and allows practitioners to take advantage of either their personal multicore machines or more dedicated computing resources.
The authors also intend to explore several variations of the g-trie algorithm, such as using different underlying graph data structures or using sampling to obtain approximate results.
Finally, to give their work a more practical context, the authors will use their implementation in real world scenarios.
TL;DR: Arabesque is presented, the first distributed data processing platform for implementing graph mining algorithms; it automates the process of exploring a very large number of subgraphs and defines a high-level filter-process computational model that simplifies the development of scalable graph mining algorithms.
Abstract: Distributed data processing platforms such as MapReduce and Pregel have substantially simplified the design and deployment of certain classes of distributed graph analytics algorithms. However, these platforms do not represent a good match for distributed graph mining problems, such as finding frequent subgraphs in a graph. Given an input graph, these problems require exploring a very large number of subgraphs and finding patterns that match some "interestingness" criteria desired by the user. These algorithms are very important for areas such as social networks, semantic web, and bioinformatics. In this paper, we present Arabesque, the first distributed data processing platform for implementing graph mining algorithms. Arabesque automates the process of exploring a very large number of subgraphs. It defines a high-level filter-process computational model that simplifies the development of scalable graph mining algorithms: Arabesque explores subgraphs and passes them to the application, which must simply compute outputs and decide whether the subgraph should be further extended. We use Arabesque's API to produce distributed solutions to three fundamental graph mining problems: frequent subgraph mining, counting motifs, and finding cliques. Our implementations require a handful of lines of code, scale to trillions of subgraphs, and represent in some cases the first available distributed solutions.
208 citations
Cites methods from "Parallel Subgraph Counting for Mult..."
...For motifs, [29] proposes a multicore parallel approach, while [34] develops methods for approximate motif counting on a tightly coupled HPC system using MPI....
TL;DR: RStream is the first single-machine, out-of-core mining system that leverages disk support to store intermediate data and demonstrates that RStream outperforms all of them, running on a 10-node cluster, e.g., by at least a factor of 1.7×, and can process large graphs on an inexpensive machine.
Abstract: Graph mining is an important category of graph algorithms that aim to discover structural patterns such as cliques and motifs in a graph. While a great deal of work has been done recently on graph computation such as PageRank, systems support for scalable graph mining is still limited. Existing mining systems such as Arabesque focus on distributed computing and need large amounts of compute and memory resources. We built RStream, the first single-machine, out-of-core mining system that leverages disk support to store intermediate data. At its core are two innovations: (1) a rich programming model that exposes relational algebra for developers to express a wide variety of mining tasks; and (2) a runtime engine that implements relational algebra efficiently with tuple streaming. A comparison between RStream and four state-of-the-art distributed mining/Datalog systems (Arabesque, ScaleMine, DistGraph, and BigDatalog) demonstrates that RStream outperforms all of them, running on a 10-node cluster, e.g., by at least a factor of 1.7×, and can process large graphs on an inexpensive machine.
81 citations
Cites background from "Parallel Subgraph Counting for Mult..."
...Recently, a body of algorithms have been developed to leverage parallel [28, 12, 59, 64], distributed systems (such as Map/Reduce) [35, 19, 41, 44, 71, 6, 36, 82, 18], or GPUs [37]....
TL;DR: An unbiased graphlet estimation framework that is fast with large speedups compared to the state of the art; parallel with nearly linear speedups; accurate with less than 1% relative error; scalable and space efficient for massive networks with billions of edges; and effective for a variety of real-world settings.
Abstract: Graphlets are induced subgraphs of a large network and are important for understanding and modeling complex networks. Despite their practical importance, graphlets have been severely limited to applications and domains with relatively small graphs. Most previous work has focused on exact algorithms; however, it is often too expensive to compute graphlets exactly in massive networks with billions of edges, and finding an approximate count is usually sufficient for many applications. In this paper, we propose an unbiased graphlet estimation framework that is: (a) fast with large speedups compared to the state of the art; (b) parallel with nearly linear speedups; (c) accurate with less than 1% relative error; (d) scalable and space efficient for massive networks with billions of edges; and (e) effective for a variety of real-world settings as well as estimating global and local graphlet statistics (e.g., counts). On 300 networks from 20 domains, we obtain <1% relative error for all graphlets. This is vastly more accurate than the existing methods while using less data. Moreover, it takes a few seconds on billion edge graphs (as opposed to days/weeks). These are by far the largest graphlet computations to date.
37 citations
Cites background from "Parallel Subgraph Counting for Mult..."
...As an aside, there have been a few distributed memory [59] and shared memory [60], [61] exact algorithms....
TL;DR: This survey aims to provide a comprehensive overview of the existing methods for subgraph counting, identifying and describing the main conceptual approaches, giving insight on their advantages and limitations, and providing pointers to existing implementations.
Abstract: Computing subgraph frequencies is a fundamental task that lies at the core of several network analysis methodologies, such as network motifs and graphlet-based metrics, which have been widely used to categorize and compare networks from multiple domains. Counting subgraphs is however computationally very expensive and there has been a large body of work on efficient algorithms and strategies to make subgraph counting feasible for larger subgraphs and networks.
This survey aims precisely to provide a comprehensive overview of the existing methods for subgraph counting. Our main contribution is a general and structured review of existing algorithms, classifying them on a set of key characteristics, highlighting their main similarities and differences. We identify and describe the main conceptual approaches, giving insight on their advantages and limitations, and provide pointers to existing implementations. We initially focus on exact sequential algorithms, but we also do a thorough survey on approximate methodologies (with a trade-off between accuracy and execution time) and parallel strategies (that need to deal with an unbalanced search space).
37 citations
Cites background or methods from "Parallel Subgraph Counting for Mult..."
...[fragment of a survey table row classifying this work: SM-Gtries [12], 2014, SM, vertices, subgraph-trees, DFS, diagonal, W-W [145]]...
[...]
...In this strategy, an idle worker asks a random worker for work [10, 12]....
[...]
...This strategy achieves a balanced work-division during runtime, and the penalty caused by worker communication is negligible [10, 12]....
[...]
...Algorithms that employ this strategy [10, 12, 151, 152] perform an initial static work division....
[...]
...To avoid the cost of synchronization and of storing partial results, most subgraph counting algorithms traverse the search space in a depth-first fashion [3, 10, 12, 56, 151– 153, 172, 190]....
TL;DR: This paper presents the first efficient distributed implementation for color coding that goes beyond tree queries, and applies to any query graph of treewidth 2, which is the first step into the realm of color coding for queries that require superlinear worst case running time.
Abstract: The problem of counting occurrences of query graphs in a large data graph, known as subgraph counting, is fundamental to several domains such as genomics and social network analysis. Many important special cases (e.g. triangle counting) have received significant attention. Color coding is a very general and powerful algorithmic technique for subgraph counting. Color coding has been shown to be effective in several applications, but scalable implementations are only known for the special case of tree queries (i.e. queries of treewidth one).
In this paper we present the first efficient distributed implementation for color coding that goes beyond tree queries: our algorithm applies to any query graph of treewidth 2. Since tree queries can be solved in time linear in the size of the data graph, our contribution is the first step into the realm of color coding for queries that require superlinear running time in the worst case. This superlinear complexity leads to significant load balancing problems on graphs with heavy-tailed degree distributions. Our algorithm structures the computation to work around high-degree nodes in the data graph, and achieves very good runtime and scalability on a diverse collection of data and query graph pairs as a result. We also provide theoretical analysis of our algorithmic techniques, showing asymptotic improvements in runtime on random graphs with power law degree distributions, a popular model for real world graphs.
27 citations
Cites methods from "Parallel Subgraph Counting for Mult..."
...Based on the above intuition, we apply dynamic programming to count the number of colorful matches of Q....
TL;DR: Network motifs, patterns of interconnections occurring in complex networks at numbers that are significantly higher than those in randomized networks, are defined and may define universal classes of networks.
Abstract: Complex networks are studied across many fields of science. To uncover their structural design principles, we defined "network motifs," patterns of interconnections occurring in complex networks at numbers that are significantly higher than those in randomized networks. We found such motifs in networks from biochemistry, neurobiology, ecology, and engineering. The motifs shared by ecological food webs were distinct from the motifs shared by the genetic networks of Escherichia coli and Saccharomyces cerevisiae or from those found in the World Wide Web. Similar motifs were found in networks that perform information processing, even though they describe elements as different as biomolecules within a cell and synaptic connections between neurons in Caenorhabditis elegans. Motifs may thus define universal classes of networks.
6,992 citations
"Parallel Subgraph Counting for Mult..." refers background in this paper
...This frequency computation lies at the core of several graph metrics, such as graphlet degree distributions [3] or network motifs [4]....
TL;DR: It is shown that any recognition problem solved by a polynomial time-bounded nondeterministic Turing machine can be “reduced” to the problem of determining whether a given propositional formula is a tautology.
Abstract: It is shown that any recognition problem solved by a polynomial time-bounded nondeterministic Turing machine can be “reduced” to the problem of determining whether a given propositional formula is a tautology. Here “reduced” means, roughly speaking, that the first problem can be solved deterministically in polynomial time provided an oracle is available for solving the second. From this notion of reducible, polynomial degrees of difficulty are defined, and it is shown that the problem of determining tautologyhood has the same polynomial degree as the problem of determining whether the first of two given graphs is isomorphic to a subgraph of the second. Other examples are discussed. A method of measuring the complexity of proof procedures for the predicate calculus is introduced and discussed.
6,675 citations
"Parallel Subgraph Counting for Mult..." refers background in this paper
...computationally hard task, closely related to subgraph isomorphism, which is one of the classical NP-complete problems [6]....
TL;DR: A modularity matrix plays a role in community detection similar to that played by the graph Laplacian in graph partitioning calculations, and a spectral measure of bipartite structure in networks and a centrality measure that identifies vertices that occupy central positions within the communities to which they belong are proposed.
Abstract: We consider the problem of detecting communities or modules in networks, groups of vertices with a higher-than-average density of edges connecting them. Previous work indicates that a robust approach to this problem is the maximization of the benefit function known as "modularity" over possible divisions of a network. Here we show that this maximization process can be written in terms of the eigenspectrum of a matrix we call the modularity matrix, which plays a role in community detection similar to that played by the graph Laplacian in graph partitioning calculations. This result leads us to a number of possible algorithms for detecting community structure, as well as several other results, including a spectral measure of bipartite structure in networks and a centrality measure that identifies vertices that occupy central positions within the communities to which they belong. The algorithms and measures proposed are illustrated with applications to a variety of real-world complex networks.
4,559 citations
"Parallel Subgraph Counting for Mult..." refers background in this paper
...[fragment of a table row describing one of the test networks: coauthorships of scientists working on network experiments [24]]...
TL;DR: Differences in the behavior of liberal and conservative blogs are found, with conservative blogs linking to each other more frequently and in a denser pattern.
Abstract: In this paper, we study the linking patterns and discussion topics of political bloggers. Our aim is to measure the degree of interaction between liberal and conservative blogs, and to uncover any differences in the structure of the two communities. Specifically, we analyze the posts of 40 "A-list" blogs over the period of two months preceding the U.S. Presidential Election of 2004, to study how often they referred to one another and to quantify the overlap in the topics they discussed, both within the liberal and conservative communities, and also across communities. We also study a single day snapshot of over 1,000 political blogs. This snapshot captures blogrolls (the list of links to other blogs frequently found in sidebars), and presents a more static picture of a broader blogosphere. Most significantly, we find differences in the behavior of liberal and conservative blogs, with conservative blogs linking to each other more frequently and in a denser pattern.
2,800 citations
"Parallel Subgraph Counting for Mult..." refers background in this paper
...[fragment of a table row describing one of the test networks: hyperlinks between weblogs on US politics [23]]...
TL;DR: A new graph generator is provided, based on a "forest fire" spreading process, that has a simple, intuitive justification, requires very few parameters (like the "flammability" of nodes), and produces graphs exhibiting the full range of properties observed both in prior work and in the present study.
Abstract: How do real graphs evolve over time? What are "normal" growth patterns in social, technological, and information networks? Many studies have discovered patterns in static graphs, identifying properties in a single snapshot of a large network, or in a very small number of snapshots; these include heavy tails for in- and out-degree distributions, communities, small-world phenomena, and others. However, given the lack of information about network evolution over long periods, it has been hard to convert these findings into statements about trends over time. Here we study a wide range of real graphs, and we observe some surprising phenomena. First, most of these graphs densify over time, with the number of edges growing super-linearly in the number of nodes. Second, the average distance between nodes often shrinks over time, in contrast to the conventional wisdom that such distance parameters should increase slowly as a function of the number of nodes (like O(log n) or O(log(log n))). Existing graph generation models do not exhibit these types of behavior, even at a qualitative level. We provide a new graph generator, based on a "forest fire" spreading process, that has a simple, intuitive justification, requires very few parameters (like the "flammability" of nodes), and produces graphs exhibiting the full range of properties observed both in prior work and in the present study.
2,548 citations
"Parallel Subgraph Counting for Mult..." refers background in this paper
...[fragment of a table row describing one of the test networks: traffic flows between routers [26]]...
Q1. What future work do the authors mention in "Parallel subgraph counting for multicore architectures"?
For example, the authors are in the process of building a large co-authorship network and plan to explore its structure using their algorithm.
Q2. What contributions have the authors mentioned in the paper "Parallel subgraph counting for multicore architectures"?
This is, however, an important graph mining primitive, with several applications, and here the authors present a novel multicore parallel algorithm for this task. The authors assess the performance of their Pthreads implementation on a set of representative networks from various domains and with diverse topological features. For most networks, the authors obtain a speedup of over 50 for 64 cores and an almost linear speedup up to 32 cores, showcasing the flexibility and scalability of their algorithm. This paves the way for using such counting algorithms on larger subgraphs and networks without requiring access to a cluster.