TALE: A Tool for Approximate Large Graph Matching

doi:10.1109/ICDE.2008.4497505

Proceedings Article

Yuanyuan Tian, +1 more

- pp 963-972

TLDR

The paper proposes a novel indexing method that incorporates graph structural information in a hybrid index structure; the index achieves high pruning power, and its size scales linearly with the database size.

Abstract:

Large graph datasets are common in many emerging database applications, and most notably in large-scale scientific applications. To fully exploit the wealth of information encoded in graphs, effective and efficient graph matching tools are critical. Due to the noisy and incomplete nature of real graph datasets, approximate, rather than exact, graph matching is required. Furthermore, many modern applications need to query large graphs, each of which has hundreds to thousands of nodes and edges. This paper presents a novel technique for approximate matching of large graph queries. We propose a novel indexing method that incorporates graph structural information in a hybrid index structure. This indexing technique achieves high pruning power and the index size scales linearly with the database size. In addition, we propose an innovative matching paradigm to query large graphs. This technique distinguishes nodes by their importance in the graph structure. The matching algorithm first matches the important nodes of a query and then progressively extends these matches. Through experiments on several real datasets, this paper demonstrates the effectiveness and efficiency of the proposed method.
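The matching paradigm described in the abstract (match the important query nodes first, then progressively extend those matches) can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not say how importance is measured or how approximation is bounded, so the sketch assumes importance is node degree, assumes a hypothetical `max_missing` tolerance on missing neighbors, and uses a plain greedy scan of the database graph in place of the paper's hybrid index. All function and parameter names are illustrative.

```python
from collections import deque


def important_nodes(adj, fraction=0.3):
    """Return the top `fraction` of nodes ranked by degree.

    Assumption: degree stands in for "importance"; the abstract does not
    name the measure. Ties are broken by node id for determinism.
    """
    ranked = sorted(adj, key=lambda n: (-len(adj[n]), n))
    k = max(1, round(fraction * len(ranked)))
    return ranked[:k]


def approx_match(q_adj, q_label, d_adj, d_label, fraction=0.3, max_missing=1):
    """Greedy approximate matching of a query graph against a database graph.

    Step 1 anchors the important query nodes; step 2 grows the mapping
    outward along query edges. Query nodes with no counterpart are simply
    left unmatched, which is what makes the result approximate.
    """
    mapping, used = {}, set()

    # Step 1: anchor each important query node on a database node with the
    # same label that lacks at most `max_missing` of its neighbors.
    for q in important_nodes(q_adj, fraction):
        for d in sorted(d_adj):
            if d in used or q_label[q] != d_label[d]:
                continue
            if len(d_adj[d]) >= len(q_adj[q]) - max_missing:
                mapping[q] = d
                used.add(d)
                break

    # Step 2: extend from matched nodes to their unmatched query neighbors,
    # looking only among neighbors of the corresponding database node.
    frontier = deque(mapping)
    while frontier:
        q = frontier.popleft()
        for qn in sorted(q_adj[q]):
            if qn in mapping:
                continue
            for dn in sorted(d_adj[mapping[q]]):
                if dn not in used and q_label[qn] == d_label[dn]:
                    mapping[qn] = dn
                    used.add(dn)
                    frontier.append(qn)
                    break
    return mapping


# Toy data (hypothetical): a star-shaped query and a slightly larger database graph.
query_adj = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
query_label = {"a": "A", "b": "B", "c": "C"}
db_adj = {1: {2, 3, 4}, 2: {1}, 3: {1, 4}, 4: {1, 3}}
db_label = {1: "A", 2: "B", 3: "C", 4: "D"}

print(approx_match(query_adj, query_label, db_adj, db_label))
# {'a': 1, 'b': 2, 'c': 3}
```

Anchoring on high-degree nodes first keeps the candidate set small before the more expensive extension step. A real system would replace the linear scan in step 1 with an index lookup and would score competing mappings rather than accept the first one found.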

Citations

Book

Managing and Mining Graph Data

Charu C. Aggarwal, +1 more

TL;DR: This is the first comprehensive survey book on the emerging topic of graph data processing; it contains extensive surveys of important graph topics such as graph languages, indexing, clustering, data generation, pattern mining, classification, keyword search, pattern matching, and privacy.

Proceedings Article

Large-scale malware indexing using function-call graphs

Xin Hu, +2 more

TL;DR: An efficient method to compute graph similarity that exploits structural and instruction-level information in the underlying malware programs, and a multi-resolution indexing scheme that uses a computationally economical feature vector for early pruning and resorts to a more accurate but computationally more expensive graph similarity function only when it needs to pinpoint the most similar neighbors.

Proceedings Article

Turboiso: towards ultrafast and robust subgraph isomorphism search in large graph databases

Wook-Shin Han, +2 more

TL;DR: This paper presents an efficient and robust subgraph search solution, called TurboISO, which is turbo-charged with two novel concepts, candidate region exploration and the combine and permute strategy (in short, Comb/Perm).

Journal Article

An in-depth comparison of subgraph isomorphism algorithms in graph databases

Jinsoo Lee, +3 more

TL;DR: Five state-of-the-art subgraph isomorphism algorithms are implemented in a common code base and compared on many real-world datasets and their query loads; the paper reports surprising empirical findings.

Proceedings Article

GADDI: distance index based subgraph matching in biological networks

Shijie Zhang, +2 more

TL;DR: A novel distance measurement is proposed that reintroduces the idea of frequent substructures in a single large graph, and a structure-distance-based approach (GADDI) is devised to efficiently find matches of a query graph in a given large graph of thousands of vertices.

Related Papers (5)

An Algorithm for Subgraph Isomorphism

Graph indexing: a frequent structure-based approach

Closure-Tree: An Index Structure for Graph Queries

Fast best-effort pattern matching in large attributed graphs

A (sub)graph isomorphism algorithm for matching large graphs