scispace - formally typeset

Showing papers on "Complement graph published in 2013"




Proceedings ArticleDOI
13 May 2013
TL;DR: This work proposes a novel factorization technique that relies on partitioning a graph so as to minimize the number of neighboring vertices rather than edges across partitions; the decomposition is based on a streaming algorithm.
Abstract: Natural graphs, such as social networks, email graphs, or instant messaging patterns, have become pervasive through the internet. These graphs are massive, often containing hundreds of millions of nodes and billions of edges. While some theoretical models have been proposed to study such graphs, their analysis is still difficult due to the scale and nature of the data. We propose a framework for large-scale graph decomposition and inference. To resolve the scale, our framework is distributed so that the data are partitioned over a shared-nothing set of machines. We propose a novel factorization technique that relies on partitioning a graph so as to minimize the number of neighboring vertices rather than edges across partitions. Our decomposition is based on a streaming algorithm. It is network-aware as it adapts to the network topology of the underlying computational hardware. We use local copies of the variables and an efficient asynchronous communication protocol to synchronize the replicated values in order to perform most of the computation without having to incur the cost of network communication. On a graph of 200 million vertices and 10 billion edges, derived from an email communication network, our algorithm retains convergence properties while allowing for almost linear scalability in the number of computers.
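The vertex-cut idea above — partitioning so that few vertices, rather than edges, are replicated across machines — can be conveyed with a toy greedy streaming partitioner. This is a Python illustration of the general idea only; the function name and scoring rule are our assumptions, not the paper's algorithm, which is additionally network-aware.

```python
from collections import defaultdict

def greedy_vertex_cut(edges, k):
    """Greedily assign each streamed edge to one of k partitions,
    preferring partitions that already hold copies of its endpoints,
    so that few vertices end up replicated across partitions."""
    placed = defaultdict(set)   # vertex -> partitions holding a copy of it
    load = [0] * k              # edges per partition, for balance
    assignment = {}
    for u, v in edges:
        # cost of partition p = number of new vertex copies it would create
        def new_copies(p):
            return (p not in placed[u]) + (p not in placed[v])
        best = min(range(k), key=lambda p: (new_copies(p), load[p]))
        assignment[(u, v)] = best
        placed[u].add(best)
        placed[v].add(best)
        load[best] += 1
    # average number of copies per vertex (1.0 = no replication at all)
    replication = sum(len(s) for s in placed.values()) / max(len(placed), 1)
    return assignment, replication
```

A replication factor close to 1 means the partitioner rarely had to cut a vertex.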

655 citations


Book ChapterDOI
03 Mar 2013
TL;DR: A generic, efficient reduction is derived that allows us to apply any differentially private algorithm for bounded-degree graphs to an arbitrary graph, based on analyzing the smooth sensitivity of the 'naive' truncation that simply discards nodes of high degree.
Abstract: We develop algorithms for the private analysis of network data that provide accurate analysis of realistic networks while satisfying stronger privacy guarantees than those of previous work. We present several techniques for designing node differentially private algorithms, that is, algorithms whose output distribution does not change significantly when a node and all its adjacent edges are added to a graph. We also develop methodology for analyzing the accuracy of such algorithms on realistic networks. The main idea behind our techniques is to 'project' (in one of several senses) the input graph onto the set of graphs with maximum degree below a certain threshold. We design projection operators, tailored to specific statistics that have low sensitivity and preserve information about the original statistic. These operators can be viewed as giving a fractional (low-degree) graph that is a solution to an optimization problem described as a maximum flow instance, linear program, or convex program. In addition, we derive a generic, efficient reduction that allows us to apply any differentially private algorithm for bounded-degree graphs to an arbitrary graph. This reduction is based on analyzing the smooth sensitivity of the 'naive' truncation that simply discards nodes of high degree.
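The 'naive' truncation that the reduction analyzes — discarding nodes of high degree — is itself simple to state. Below is a minimal Python sketch of just that operator; the paper's actual contribution is calibrating noise to the smooth sensitivity of this map, which is not shown here.

```python
def truncate(adj, D):
    """Naive truncation: drop every node whose degree exceeds D,
    together with its incident edges, yielding a D-bounded graph.
    adj is an adjacency dict mapping each vertex to a set of neighbours."""
    keep = {v for v, nbrs in adj.items() if len(nbrs) <= D}
    return {v: {u for u in adj[v] if u in keep} for v in keep}
```

Note that adding one node (and its edges) can change the degrees of many others, which is exactly why the sensitivity analysis of this operator is delicate.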

321 citations


Proceedings ArticleDOI
07 Oct 2013
TL;DR: A very simple percolation-based graph matching algorithm that incrementally maps every pair of nodes (i,j) with at least r neighboring mapped pairs is proposed; its simplicity makes possible a rigorous analysis that relies on recent advances in bootstrap percolation theory for the G(n,p) random graph.
Abstract: Graph matching is a generalization of the classic graph isomorphism problem. By using only their structures a graph-matching algorithm finds a map between the vertex sets of two similar graphs. This has applications in the de-anonymization of social and information networks and, more generally, in the merging of structural data from different domains. One class of graph-matching algorithms starts with a known seed set of matched node pairs. Despite the success of these algorithms in practical applications, their performance has been observed to be very sensitive to the size of the seed set. The lack of a rigorous understanding of parameters and performance makes it difficult to design systems and predict their behavior. In this paper, we propose and analyze a very simple percolation-based graph matching algorithm that incrementally maps every pair of nodes (i,j) with at least r neighboring mapped pairs. The simplicity of this algorithm makes possible a rigorous analysis that relies on recent advances in bootstrap percolation theory for the G(n,p) random graph. We prove conditions on the model parameters in which percolation graph matching succeeds, and we establish a phase transition in the size of the seed set. We also confirm through experiments that the performance of percolation graph matching is surprisingly good, both for synthetic graphs and real social-network data.
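A simplified version of the percolation rule — map any unmatched pair once it has accumulated at least r neighboring mapped pairs — can be sketched as follows. This is an illustrative Python sketch under our own assumptions (e.g. matched nodes are immediately excluded from further pairs); details differ from the algorithm analyzed in the paper.

```python
from collections import defaultdict

def percolation_match(g1, g2, seeds, r=2):
    """Simplified percolation graph matching.  g1, g2 are adjacency
    dicts of sets; seeds is an iterable of already-matched (i, j) pairs.
    Each matched pair spreads one 'mark' to every neighbouring candidate
    pair; a pair with at least r marks is matched and percolates further."""
    matched = set(seeds)
    used1 = {i for i, _ in matched}
    used2 = {j for _, j in matched}
    scores = defaultdict(int)
    frontier = list(matched)
    while frontier:
        i, j = frontier.pop()
        for u in g1[i]:
            for v in g2[j]:
                if u in used1 or v in used2:
                    continue
                scores[(u, v)] += 1
                if scores[(u, v)] >= r:
                    matched.add((u, v))
                    used1.add(u)
                    used2.add(v)
                    frontier.append((u, v))
    return matched
```

On two copies of a triangle with two seed pairs, the third pair percolates at r = 2.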

177 citations


Journal ArticleDOI
TL;DR: This paper surveys inexact graph matching, considering classes of graphs roughly differentiated by the complexity of the labels defined on vertices and edges, and explains significant instances of each graph matching methodology considered in the technical literature.
Abstract: In this paper, we present a survey of the state of the art of the graph matching problem, conceived as the most important element in the definition of inductive inference engines in graph-based pattern recognition applications. We review both methodological and algorithmic results, focusing on inexact graph matching procedures. We consider different classes of graphs, roughly differentiated by the complexity of the labels defined on vertices and edges. Emphasis is given to the understanding of the underlying methodological aspects of each identified research branch. A selection of inexact graph matching algorithms is presented and synthetically described, aiming at explaining some significant instances of each graph matching methodology considered in the technical literature.

173 citations


Posted Content
TL;DR: This survey discusses both classical text-book type properties and some advanced properties of graph sampling, and provides a taxonomy of different graph sampling objectives and graph sampling approaches.
Abstract: Graph sampling is a technique for picking a subset of vertices and/or edges from an original graph. It has a wide spectrum of applications, e.g. surveying hidden populations in sociology [54], visualizing social graphs [29], scaling down Internet AS graphs [27], graph sparsification [8], etc. In some scenarios the whole graph is known and the purpose of sampling is to obtain a smaller graph; in other scenarios the graph is unknown and sampling is regarded as a way to explore it. Commonly used techniques are Vertex Sampling, Edge Sampling and Traversal Based Sampling. We provide a taxonomy of graph sampling objectives and graph sampling approaches. The relations between these approaches are formally argued, and a general framework to bridge theoretical analysis and practical implementation is provided. Although smaller in size, sampled graphs may resemble the original graph in various ways, and we are particularly interested in which graph properties are preserved by a given sampling procedure. If some properties are preserved, we can estimate them on the sampled graphs, which gives a way to construct efficient estimators. If an algorithm relies on the preserved properties, we can expect it to give similar output on the original and sampled graphs, which leads to a systematic way to accelerate a class of graph algorithms. In this survey, we discuss both classical text-book properties and some advanced ones. The landscape is tabulated, and many gaps in this field become visible. Some theoretical studies are collected in this survey and simple extensions are made. Most previous numerical evaluations are ad hoc, i.e. they evaluate different types of graphs, different sets of properties, and different sampling algorithms; a systematic and neutral evaluation is needed to shed light on further graph sampling studies.
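The three commonly used techniques named above can each be sketched in a few lines (illustrative Python, assuming the graph is an adjacency dict of sets; the traversal sketch further assumes a connected graph with at least k vertices):

```python
import random

def vertex_sample(adj, k, rng=random):
    """Vertex Sampling: keep k uniformly chosen vertices and the
    edges among them (the induced subgraph)."""
    kept = set(rng.sample(sorted(adj), k))
    return {v: adj[v] & kept for v in kept}

def edge_sample(edges, k, rng=random):
    """Edge Sampling: keep k uniformly chosen edges."""
    return rng.sample(sorted(edges), k)

def traversal_sample(adj, start, k, rng=random):
    """Traversal Based Sampling: simple random walk from `start`
    until k distinct vertices have been visited."""
    seen, v = {start}, start
    while len(seen) < k:
        v = rng.choice(sorted(adj[v]))
        seen.add(v)
    return seen
```

Which properties each scheme preserves (degree distribution, clustering, etc.) is exactly the survey's central question.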

150 citations


Journal ArticleDOI
01 Jan 2013
TL;DR: This paper proposes NeMa (Network Match), a neighborhood-based subgraph matching technique for querying real-life networks, and proposes a novel subgraph matching cost metric that aggregates the costs of matching individual nodes and unifies both structure and node label similarities.
Abstract: It is increasingly common to find real-life data represented as networks of labeled, heterogeneous entities. To query these networks, one often needs to identify the matches of a given query graph in a (typically large) network modeled as a target graph. Due to noise and the lack of fixed schema in the target graph, the query graph can substantially differ from its matches in the target graph in both structure and node labels, thus bringing challenges to the graph querying tasks. In this paper, we propose NeMa (Network Match), a neighborhood-based subgraph matching technique for querying real-life networks. (1) To measure the quality of the match, we propose a novel subgraph matching cost metric that aggregates the costs of matching individual nodes, and unifies both structure and node label similarities. (2) Based on the metric, we formulate the minimum cost subgraph matching problem. Given a query graph and a target graph, the problem is to identify the (top-k) matches of the query graph with minimum costs in the target graph. We show that the problem is NP-hard, and also hard to approximate. (3) We propose a heuristic algorithm for solving the problem based on an inference model. In addition, we propose optimization techniques to improve the efficiency of our method. (4) We empirically verify that NeMa is both effective and efficient compared to the keyword search and various state-of-the-art graph querying techniques.
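A toy node-matching cost in the spirit of point (1) — one term for label similarity and one for neighborhood similarity — might look as follows. The exact NeMa metric differs; the mixing weight `alpha` and the Jaccard-style neighborhood term are our assumptions for illustration.

```python
def node_cost(q, t, qadj, tadj, qlab, tlab, alpha=0.5):
    """Toy cost of matching query node q to target node t, combining
    a label-mismatch term with a neighbourhood-label overlap term.
    qadj/tadj are adjacency dicts of sets; qlab/tlab map nodes to labels."""
    label = 0.0 if qlab[q] == tlab[t] else 1.0
    nq = {qlab[u] for u in qadj[q]}          # labels around q
    nt = {tlab[v] for v in tadj[t]}          # labels around t
    structure = 1.0 - len(nq & nt) / max(len(nq | nt), 1)
    return alpha * label + (1 - alpha) * structure
```

A full matcher would aggregate such costs over a candidate mapping and minimize the total, which is the NP-hard problem the paper attacks heuristically.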

141 citations


Proceedings ArticleDOI
11 Aug 2013
TL;DR: A space efficient algorithm that approximates the transitivity and total triangle count with only a single pass through a graph given as a stream of edges, based on the classic probabilistic result, the birthday paradox is designed.
Abstract: We design a space efficient algorithm that approximates the transitivity (global clustering coefficient) and total triangle count with only a single pass through a graph given as a stream of edges. Our procedure is based on the classic probabilistic result, the birthday paradox. When the transitivity is constant and there are more edges than wedges (common properties for social networks), we can prove that our algorithm requires O(√n) space (n is the number of vertices) to provide accurate estimates. We run a detailed set of experiments on a variety of real graphs and demonstrate that the memory requirement of the algorithm is a tiny fraction of the graph. For example, even for a graph with 200 million edges, our algorithm stores just 60,000 edges to give accurate results. Being a single pass streaming algorithm, our procedure also maintains a real-time estimate of the transitivity/number of triangles of a graph, by storing a minuscule fraction of edges.
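The flavor of a single-pass procedure can be conveyed with a simplified reservoir sketch: keep a uniform sample of edges and count the wedges in the sample that each arriving edge closes. This is not the paper's exact estimator, and the rescaling that turns the raw count into a triangle estimate is omitted.

```python
import random

def streaming_triangle_signal(edge_stream, reservoir_size, rng=random):
    """Single pass over an edge stream: maintain a uniform reservoir
    sample of edges and count the wedges in the sample closed by each
    arriving edge.  The raw count must still be rescaled (not shown)
    to estimate the true triangle count."""
    reservoir = []
    closed = 0
    for t, (u, v) in enumerate(edge_stream, 1):
        nbrs_u = {b for a, b in reservoir if a == u} | {a for a, b in reservoir if b == u}
        nbrs_v = {b for a, b in reservoir if a == v} | {a for a, b in reservoir if b == v}
        closed += len(nbrs_u & nbrs_v)           # wedges u-w-v closed by (u, v)
        if len(reservoir) < reservoir_size:
            reservoir.append((u, v))
        elif rng.random() < reservoir_size / t:  # classic reservoir sampling
            reservoir[rng.randrange(reservoir_size)] = (u, v)
    return closed
```

When the reservoir is large enough to hold every edge, the raw count is exact.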

138 citations


Proceedings ArticleDOI
26 May 2013
TL;DR: This framework extends traditional discrete signal processing theory to structured datasets by viewing them as signals represented by graphs, so that signal coefficients are indexed by graph nodes and relations between them are represented by weighted graph edges.
Abstract: We propose a novel discrete signal processing framework for the representation and analysis of datasets with complex structure. Such datasets arise in many social, economic, biological, and physical networks. Our framework extends traditional discrete signal processing theory to structured datasets by viewing them as signals represented by graphs, so that signal coefficients are indexed by graph nodes and relations between them are represented by weighted graph edges. We discuss the notions of signals and filters on graphs, and define the concepts of the spectrum and Fourier transform for graph signals. We demonstrate their relation to the generalized eigenvector basis of the graph adjacency matrix and study their properties. As a potential application of the graph Fourier transform, we consider the efficient representation of structured data that utilizes the sparseness of graph signals in the frequency domain.

132 citations



Proceedings ArticleDOI
22 Jun 2013
TL;DR: A novel, efficient threshold-based graph decomposition algorithm, with time complexity O(l × |E|), to decompose a graph G at each iteration, where l usually is a small integer with l ≪ |V|.
Abstract: Efficiently computing k-edge connected components in a large graph, G = (V, E), where V is the vertex set and E is the edge set, is a long-standing research problem. It is not only fundamental in graph analysis but also crucial in graph search optimization algorithms. Considering that existing techniques for computing k-edge connected components are quite time-consuming and unlikely to be scalable for large graphs, in this paper we first propose a novel graph decomposition paradigm to iteratively decompose a graph G for computing its k-edge connected components such that the number of drilling-down iterations h is bounded by the "depth" of the k-edge connected components nested together to form G, where h usually is a small integer in practice. Second, we devise a novel, efficient threshold-based graph decomposition algorithm, with time complexity O(l × |E|), to decompose a graph G at each iteration, where l usually is a small integer with l ≪ |V|. As a result, our algorithm for computing k-edge connected components significantly improves the time complexity of an existing state-of-the-art technique from O(|V|^2|E| + |V|^3 log |V|) to O(h × l × |E|). Finally, we conduct extensive performance studies on large real and synthetic graphs. The performance studies demonstrate that our techniques significantly outperform the state-of-the-art solution by several orders of magnitude.

Proceedings ArticleDOI
22 Jun 2013
TL;DR: A new algorithm is developed that is provably I/O- and CPU-efficient at the same time, without making any assumption on the input G at all, and outperformed the existing competitors by over an order of magnitude in extensive experimentation.
Abstract: This paper studies I/O-efficient algorithms for settling the classic triangle listing problem, whose solution is a basic operator in dealing with many other graph problems. Specifically, given an undirected graph G, the objective of triangle listing is to find all the cliques involving 3 vertices in G. The problem has been well studied in internal memory, but remains a difficult challenge when G does not fit in memory, forcing any algorithm to incur frequent I/O accesses. Although previous research has attempted to tackle the challenge, the state-of-the-art solutions rely on a set of crippling assumptions to guarantee good performance. Motivated by this, we develop a new algorithm that is provably I/O- and CPU-efficient at the same time, without making any assumption on the input G at all. The algorithm uses ideas drastically different from all the previous approaches, and outperformed the existing competitors by over an order of magnitude in our extensive experimentation.

Posted Content
TL;DR: In this article, the independence number of the maximal triangle-free graph produced by the triangle-free process is bounded, giving a lower bound on the Ramsey number R(3,t) that is within a 4+o(1) factor of the best known upper bound.
Abstract: The triangle-free process begins with an empty graph on n vertices and iteratively adds edges chosen uniformly at random subject to the constraint that no triangle is formed. We determine the asymptotic number of edges in the maximal triangle-free graph at which the triangle-free process terminates. We also bound the independence number of this graph, which gives an improved lower bound on the Ramsey numbers R(3,t): we show R(3,t) > (1-o(1)) t^2 / (4 log t), which is within a 4+o(1) factor of the best known upper bound. Our improvement on previous analyses of this process exploits the self-correcting nature of key statistics of the process. Furthermore, we determine which bounded size subgraphs are likely to appear in the maximal triangle-free graph produced by the triangle-free process: they are precisely those triangle-free graphs with density at most 2.
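The process itself is easy to simulate via its random-permutation (greedy) formulation: scan the potential edges in random order and keep each one that closes no triangle. This is illustrative Python only; the paper's results concern the asymptotics of the process, not its simulation.

```python
import random

def triangle_free_process(n, rng=random):
    """Greedy formulation of the triangle-free process on n vertices:
    consider all potential edges in uniformly random order, adding each
    one unless its endpoints already share a neighbour (which would
    close a triangle).  The result is a maximal triangle-free graph."""
    adj = {v: set() for v in range(n)}
    order = [(u, v) for u in range(n) for v in range(u + 1, n)]
    rng.shuffle(order)
    for u, v in order:
        if not (adj[u] & adj[v]):   # no common neighbour => no triangle
            adj[u].add(v)
            adj[v].add(u)
    return adj
```

Maximality follows because a rejected edge had a common neighbor at rejection time, and that neighbor never goes away.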

Proceedings ArticleDOI
08 Apr 2013
TL;DR: In this article, the authors propose to find all instances of a given sample graph in a larger data graph using a single round of map-reduce, using the techniques of multiway joins.
Abstract: The theme of this paper is how to find all instances of a given “sample” graph in a larger “data graph,” using a single round of map-reduce. For the simplest sample graph, the triangle, we improve upon the best known such algorithm. We then examine the general case, considering both the communication cost between mappers and reducers and the total computation cost at the reducers. To minimize communication cost, we exploit the techniques of [1] for computing multiway joins (evaluating conjunctive queries) in a single map-reduce round. Several methods are shown for translating sample graphs into a union of conjunctive queries with as few queries as possible. We also address the matter of optimizing computation cost. Many serial algorithms are shown to be “convertible,” in the sense that it is possible to partition the data graph, explore each partition in a separate reducer, and have the total computation cost at the reducers be of the same order as the computation cost of the serial algorithm.

Journal ArticleDOI
TL;DR: An algorithm for constructing a spectral sparsifier of G with O(n log n/ϵ^2) edges, taking $\tilde{O}(m)$ time and requiring only one pass over G, which has the property that it maintains at all times a valid sparsifier for the subgraph of G received so far.
Abstract: Let G be a graph with n vertices and m edges. A sparsifier of G is a sparse graph on the same vertex set approximating G in some natural way. It allows us to say useful things about G while considering much fewer than m edges. The strongest commonly-used notion of sparsification is spectral sparsification; H is a spectral sparsifier of G if the quadratic forms induced by the Laplacians of G and H approximate one another well. This notion is strictly stronger than the earlier concept of combinatorial sparsification. In this paper, we consider a semi-streaming setting, where we have only $\tilde{O}(n)$ storage space, and we thus cannot keep all of G. In this case, maintaining a sparsifier instead gives us a useful approximation to G, allowing us to answer certain questions about the original graph without storing all of it. We introduce an algorithm for constructing a spectral sparsifier of G with O(n log n/ϵ^2) edges (where ϵ is a parameter measuring the quality of the sparsifier), taking $\tilde{O}(m)$ time and requiring only one pass over G. In addition, our algorithm has the property that it maintains at all times a valid sparsifier for the subgraph of G that we have received. Our algorithm is natural and conceptually simple. As we read edges of G, we add them to the sparsifier H. Whenever H gets too big, we resparsify it in $\tilde{O}(n)$ time. Adding edges to a graph changes the structure of its sparsifier's restriction to the already existing edges. It would thus seem that the above procedure would cause errors to compound each time that we resparsify, and that we should need to either retain significantly more information or reexamine previously discarded edges in order to construct the new sparsifier. However, we show how to use the information contained in H to perform this resparsification using only the edges retained by earlier steps in nearly linear time.

Proceedings ArticleDOI
26 May 2013
TL;DR: This work proposes a novel discrete signal processing framework for structured datasets that arise from social, economic, biological, and physical networks, and demonstrates the application of graph filters to data classification by showing that a classifier can be interpreted as an adaptive graph filter.
Abstract: We propose a novel discrete signal processing framework for structured datasets that arise from social, economic, biological, and physical networks. Our framework extends traditional discrete signal processing theory to datasets with complex structure that can be represented by graphs, so that data elements are indexed by graph nodes and relations between elements are represented by weighted graph edges. We interpret such datasets as signals on graphs, introduce the concept of graph filters for processing such signals, and discuss important properties of graph filters, including linearity, shift-invariance, and invertibility. We then demonstrate the application of graph filters to data classification by demonstrating that a classifier can be interpreted as an adaptive graph filter. Our experiments demonstrate that the proposed approach achieves high classification accuracy.
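The basic objects of the framework — the graph shift and polynomial (linear, shift-invariant) filters — can be illustrated directly. This is a pure-Python sketch with the adjacency matrix as the shift operator; the function names are ours.

```python
def shift(A, s):
    """One application of the graph shift: (A s)[i] = sum_j A[i][j] * s[j],
    with A the adjacency matrix (list of lists) and s a graph signal."""
    return [sum(a * x for a, x in zip(row, s)) for row in A]

def graph_filter(A, coeffs, s):
    """Polynomial graph filter h(A) s = c0*s + c1*A s + c2*A^2 s + ...
    These are exactly the linear shift-invariant filters of the framework:
    they commute with the graph shift A."""
    out = [0.0] * len(s)
    power = list(s)
    for k, c in enumerate(coeffs):
        if k > 0:
            power = shift(A, power)
        out = [o + c * p for o, p in zip(out, power)]
    return out
```

With coefficients [1] the filter is the identity; with [0, 1] it is the pure shift.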

Journal ArticleDOI
TL;DR: This paper investigates whether the Jensen-Shannon divergence can be used as a means of establishing a graph kernel, and uses kernel principal components analysis (kPCA) to embed graphs into a feature space.
Abstract: Graph-based representations have proved powerful in computer vision. The challenge that arises with large amounts of graph data is that of computationally burdensome edit distance computation. Graph kernels can be used to formulate efficient algorithms to deal with high-dimensional data, and have proved an elegant way to overcome this computational bottleneck. In this paper, we investigate whether the Jensen-Shannon divergence can be used as a means of establishing a graph kernel. The Jensen-Shannon kernel is a nonextensive information-theoretic kernel, defined using the entropy and mutual information computed from probability distributions over the structures being compared. To establish a Jensen-Shannon graph kernel, we explore two different approaches. The first is based on the von Neumann entropy associated with a graph. The second uses the Shannon entropy associated with the probability state vector for a steady-state random walk on a graph. We compare the two resulting graph kernels for the problem of graph clustering. We use kernel principal components analysis (kPCA) to embed graphs into a feature space. Experimental results reveal that the method gives good classification results on graphs extracted both from an object recognition database and from an application in bioinformatics.
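A toy version of the second approach — comparing graphs through the Shannon entropy of the steady-state random-walk distribution π(v) = deg(v)/2|E| — might look as follows. This is heavily simplified: the zero-padded alignment of the distributions and the ln 2 − JSD similarity are our assumptions, and the paper builds a composite structure instead.

```python
from math import log

def stationary(adj):
    """Steady-state distribution of a random walk on an undirected graph:
    pi(v) = deg(v) / 2|E|, returned sorted descending for alignment."""
    degs = sorted((len(nbrs) for nbrs in adj.values()), reverse=True)
    total = sum(degs)
    return [d / total for d in degs]

def entropy(p):
    return -sum(x * log(x) for x in p if x > 0)

def js_graph_kernel(adj1, adj2):
    """Toy Jensen-Shannon graph kernel: JSD between the two stationary
    distributions (zero-padded to equal length), turned into a
    similarity via k = ln 2 - JSD, so identical graphs score ln 2."""
    p, q = stationary(adj1), stationary(adj2)
    n = max(len(p), len(q))
    p += [0.0] * (n - len(p))
    q += [0.0] * (n - len(q))
    m = [(a + b) / 2 for a, b in zip(p, q)]
    jsd = entropy(m) - (entropy(p) + entropy(q)) / 2
    return log(2) - jsd
```

Since JSD is bounded by ln 2, the kernel value is always non-negative.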

Journal ArticleDOI
TL;DR: The general bounds of the metric dimension of a lexicographic product of any connected graph G and an arbitrary graph H are given and it is shown that the bounds are sharp.

Journal ArticleDOI
TL;DR: In this article, an even simpler linear-time algorithm is presented that computes a structure from which both the 2-vertex- and 2-edge-connectivity of a graph can be easily 'read off'.

Proceedings Article
03 Aug 2013
TL;DR: The key idea is to repeatedly generate a small number of "supernodes" connected to the regular nodes, in order to compress the original graph into a sparse bipartite graph.
Abstract: Graph clustering has received growing attention in recent years as an important analytical technique, both due to the prevalence of graph data, and the usefulness of graph structures for exploiting intrinsic data characteristics. However, as graph data grows in scale, it becomes increasingly more challenging to identify clusters. In this paper we propose an efficient clustering algorithm for large-scale graph data using spectral methods. The key idea is to repeatedly generate a small number of "supernodes" connected to the regular nodes, in order to compress the original graph into a sparse bipartite graph. By clustering the bipartite graph using spectral methods, we are able to greatly improve efficiency without losing considerable clustering power. Extensive experiments show the effectiveness and efficiency of our approach.

Journal ArticleDOI
TL;DR: An efficient algorithm that combines search and pruning strategies to look for the most relevant topological patterns is presented, and three interestingness measures of topological patterns are proposed that differ by the pairs of vertices considered while evaluating up and down co-variations between vertex descriptors.
Abstract: We propose to mine the graph topology of a large attributed graph by finding regularities among vertex descriptors. Such descriptors are of two types: 1) the vertex attributes that convey the information of the vertices themselves and 2) some topological properties used to describe the connectivity of the vertices. These descriptors are mostly of numerical or ordinal types and their similarity can be captured by quantifying their covariation. Mining topological patterns relies on frequent pattern mining and graph topology analysis to reveal the links that exist between the relation encoded by the graph and the vertex attributes. We propose three interestingness measures of topological patterns that differ by the pairs of vertices considered while evaluating up and down co-variations between vertex descriptors. An efficient algorithm that combines search and pruning strategies to look for the most relevant topological patterns is presented. Besides a classical empirical study, we report case studies on four real-life networks showing that our approach provides valuable knowledge.

Journal ArticleDOI
TL;DR: In this article, the authors considered the problem of target set selection in block-cactus graphs, chordal graphs, and Hamming graphs, and showed that if the underlying graph G is a chordal graph with thresholds θ(v) ≤ 2 for each vertex v in G, then the problem can be solved in linear time.
Abstract: In this paper we consider a fundamental problem in the area of viral marketing, called Target Set Selection problem. We study the problem when the underlying graph is a block-cactus graph, a chordal graph or a Hamming graph. We show that if G is a block-cactus graph, then the Target Set Selection problem can be solved in linear time, which generalizes Chen’s result (Discrete Math. 23:1400–1415, 2009) for trees, and the time complexity is much better than the algorithm in Ben-Zwi et al. (Discrete Optim., 2010) (for bounded treewidth graphs) when restricted to block-cactus graphs. We show that if the underlying graph G is a chordal graph with thresholds θ(v)≤2 for each vertex v in G, then the problem can be solved in linear time. For a Hamming graph G having thresholds θ(v)=2 for each vertex v of G, we precisely determine an optimal target set S for (G,θ). These results partially answer an open problem raised by Dreyer and Roberts (Discrete Appl. Math. 157:1615–1627, 2009).
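The underlying activation dynamics are easy to state: a set S is a target set iff iterating the threshold rule from S eventually activates every vertex. Below is a minimal Python check of that property; finding a minimum such S is the hard problem the paper studies.

```python
def activates(adj, thresholds, seed):
    """Simulate threshold activation: an inactive vertex v becomes active
    once at least thresholds[v] of its neighbours are active.  Returns
    True iff the seed set ends up activating the whole graph, i.e. iff
    seed is a target set.  adj is an adjacency dict of sets."""
    active = set(seed)
    changed = True
    while changed:
        changed = False
        for v in adj:
            if v not in active and len(adj[v] & active) >= thresholds[v]:
                active.add(v)
                changed = True
    return active == set(adj)
```

On a path with all thresholds 1, a single endpoint suffices as a target set.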

Journal ArticleDOI
TL;DR: This paper presents two methods of realising arbitrarily complex directed graphs as robust heteroclinic networks for flows generated by ODEs, one of which realises the graph as an invariant set within an attractor, and discusses some illustrative examples.

Journal ArticleDOI
TL;DR: It is proved that every graph with maximum average degree at most 14/5 is (1,1)-colorable, and it follows that every planar graph with girth at least 7 is (1,1)-colorable.

Proceedings ArticleDOI
27 Oct 2013
TL;DR: This paper derives a lower bound, branch-based bound, which can greatly reduce the search space of the graph similarity search, and proposes a tree index structure, namely b-tree, to facilitate effective pruning and efficient query processing.
Abstract: Due to many real applications of graph databases, it has become increasingly important to retrieve graphs g (in graph database D) that approximately match with query graph q, rather than exact subgraph matches. In this paper, we study the problem of graph similarity search, which retrieves graphs that are similar to a given query graph under the constraint of the minimum edit distance. Specifically, we derive a lower bound, branch-based bound, which can greatly reduce the search space of the graph similarity search. We also propose a tree index structure, namely b-tree, to facilitate effective pruning and efficient query processing. Extensive experiments confirm that our proposed approach outperforms the existing approaches by orders of magnitude, in terms of both pruning power and query response time.

Journal ArticleDOI
TL;DR: It is shown that the structure of a planar graph on n vertices, and with constant maximum degree d, is determined, up to the modification (insertion or deletion) of at most $\epsilon d n$ edges, by the frequency of $k$-discs for certain $k=k(\epsilon,d)$ that is independent of the size of the graph.
Abstract: A $k$-disc around a vertex $v$ of a graph $G=(V,E)$ is the subgraph induced by all vertices of distance at most $k$ from $v$. We show that the structure of a planar graph on $n$ vertices, and with constant maximum degree $d$, is determined, up to the modification (insertion or deletion) of at most $\epsilon d n$ edges, by the frequency of $k$-discs for certain $k=k(\epsilon,d)$ that is independent of the size of the graph. We can replace planar graphs by any hyperfinite class of graphs, which includes, for example, every graph class that does not contain a set of forbidden minors. A pure combinatorial consequence of this result is that two $d$-bounded degree graphs that have similar frequency vectors (that is, the $\ell_1$ difference between the frequency vectors is small) are close to isomorphic (where close here means that by inserting or deleting not too many edges in one of them, it becomes isomorphic to the other). We also obtain the following new results in the area of property testing, which are es...
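A k-disc is straightforward to extract by breadth-first search (illustrative Python; the paper's results concern what the frequencies of such discs determine, not their computation):

```python
from collections import deque

def k_disc(adj, v, k):
    """Return the k-disc around v: the subgraph induced by all vertices
    at distance at most k from v, as an adjacency dict of sets."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == k:        # don't expand past radius k
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    ball = set(dist)
    return {u: adj[u] & ball for u in ball}
```

The frequency vector the paper studies counts, for each isomorphism type of rooted k-disc, how often it occurs over all vertices.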

Journal ArticleDOI
TL;DR: For conflict graphs it is proved that the maximum flow problem is strongly $\mathcal{NP}$-hard, even if the conflict graph consists only of unconnected edges, and for forcing graphs the results imply that no polynomial time approximation algorithm can exist for both problems.
Abstract: We study the maximum flow problem subject to binary disjunctive constraints in a directed graph: A negative disjunctive constraint states that a certain pair of arcs in a digraph cannot be simultaneously used for sending flow in a feasible solution. In contrast to this, positive disjunctive constraints force that for certain pairs of arcs at least one arc has to carry flow in a feasible solution. It is convenient to represent the negative disjunctive constraints in terms of a so-called conflict graph whose vertices correspond to the arcs of the underlying graph, and whose edges encode the constraints. Analogously we represent the positive disjunctive constraints by a so-called forcing graph. For conflict graphs we prove that the maximum flow problem is strongly $\mathcal{NP}$ -hard, even if the conflict graph consists only of unconnected edges. This result still holds if the network consists only of disjoint paths of length three. For forcing graphs we also provide a sharp line between polynomially solvable and strongly $\mathcal{NP}$ -hard instances for the case where the flow values are required to be integral. Moreover, our hardness results imply that no polynomial time approximation algorithm can exist for both problems. In contrast to this we show that the maximum flow problem with a forcing graph can be solved efficiently if fractional flow values are allowed.

Journal ArticleDOI
TL;DR: A message-passing algorithm for counting short cycles in a graph which is based on performing integer additions and subtractions in the nodes of the graph and passing extrinsic messages to adjacent nodes is presented.
Abstract: A message-passing algorithm for counting short cycles in a graph is presented. For bipartite graphs, which are of particular interest in coding, the algorithm is capable of counting cycles of length g, g+2, ..., 2g-2, where g is the girth of the graph. For a general (non-bipartite) graph, cycles of length g, g+1, ..., 2g-1 can be counted. The algorithm is based on performing integer additions and subtractions in the nodes of the graph and passing extrinsic messages to adjacent nodes. The complexity of the proposed algorithm grows as O(g|E|^2), where |E| is the number of edges in the graph. For sparse graphs, the proposed algorithm significantly outperforms the existing algorithms, tailored for counting short cycles, in terms of computational complexity and memory requirements. We also discuss a more generic and basic approach of counting short cycles which is based on matrix multiplication, and provide a message-passing interpretation for such an approach. We then demonstrate that an efficient implementation of the matrix multiplication approach has essentially the same complexity as the proposed message-passing algorithm.
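The simplest instance of the matrix-multiplication approach mentioned above is counting cycles of length 3: closed walks of length 3 are exactly triangles, each seen 6 times (3 starting points × 2 directions), so the count is trace(A^3)/6. Pure-Python sketch; longer cycles require correction terms for degenerate closed walks, which is where the message-passing algorithm pays off.

```python
def matmul(A, B):
    """Naive matrix product of two square list-of-lists matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def count_triangles(A):
    """Count 3-cycles from the adjacency matrix A:
    #triangles = trace(A^3) / 6, since every closed 3-walk is a
    triangle and each triangle yields six closed 3-walks."""
    A3 = matmul(matmul(A, A), A)
    return sum(A3[i][i] for i in range(len(A))) // 6
```

For length ≥ 4, trace(A^k) also counts walks that backtrack along edges, so it overcounts cycles.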

Journal ArticleDOI
TL;DR: It is shown that if G is an n-vertex maximal outerplanar graph, then γ(G) ≤ (n+t)/4, where t is the number of vertices of degree 2 in G and γ(G) is the minimum cardinality of a dominating set of G.

Journal ArticleDOI
01 Nov 2013
TL;DR: A partition-based approach to tackle the graph similarity queries with edit distance constraints is presented, by dividing data graphs into variable-size non-overlapping partitions, and the edit distance constraint is converted to a graph containment constraint for candidate generation.
Abstract: Graphs are widely used to model complex data in many applications, such as bioinformatics, chemistry, social networks, pattern recognition, etc. A fundamental and critical query primitive is to efficiently search similar structures in a large collection of graphs. This paper studies the graph similarity queries with edit distance constraints. Existing solutions to the problem utilize fixed-size overlapping substructures to generate candidates, and thus become susceptible to large vertex degrees or large distance thresholds. In this paper, we present a partition-based approach to tackle the problem. By dividing data graphs into variable-size non-overlapping partitions, the edit distance constraint is converted to a graph containment constraint for candidate generation. We develop efficient query processing algorithms based on the new paradigm. A candidate pruning technique and an improved graph edit distance algorithm are also developed to further boost the performance. In addition, a cost-aware graph partitioning technique is devised to optimize the index. Extensive experiments demonstrate our approach significantly outperforms existing approaches.
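The pigeonhole principle behind the conversion is simple: one edit operation can damage at most one of the non-overlapping partitions, so if more than τ of a data graph's partitions have no match, its edit distance to the query exceeds τ. Below is a schematic Python filter; `contains` stands in for the subgraph-containment test and is an assumption, as is the function name.

```python
def partition_filter(partitions, contains, tau):
    """Partition-based candidate filter.  `partitions` are the
    non-overlapping pieces of a data graph and `contains(part)` tests
    whether a piece appears (unedited) in the query.  Since each edit
    destroys at most one partition, more than tau missing pieces imply
    edit distance > tau; return True iff the graph stays a candidate."""
    missing = sum(1 for part in partitions if not contains(part))
    return missing <= tau
```

Surviving candidates still need full edit-distance verification; the filter only prunes.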