Proceedings Article

TALE: A Tool for Approximate Large Graph Matching

TLDR
Proposes a novel indexing method that incorporates graph structural information in a hybrid index structure, achieving high pruning power with an index size that scales linearly with the database size.
Abstract
Large graph datasets are common in many emerging database applications, most notably in large-scale scientific applications. To fully exploit the wealth of information encoded in graphs, effective and efficient graph matching tools are critical. Because real graph datasets are noisy and incomplete, approximate rather than exact graph matching is required. Furthermore, many modern applications need to query large graphs, each with hundreds to thousands of nodes and edges. This paper presents a novel technique for approximate matching of large graph queries. We propose an indexing method that incorporates graph structural information in a hybrid index structure. This index achieves high pruning power, and its size scales linearly with the database size. In addition, we propose a matching paradigm for querying large graphs that distinguishes nodes by their importance in the graph structure: the matching algorithm first matches the important nodes of a query and then progressively extends these matches. Experiments on several real datasets demonstrate the effectiveness and efficiency of the proposed method.
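The abstract gives only the outline of the method, so the following is a minimal sketch of the two ideas rather than the paper's actual implementation. It assumes, purely for illustration, that node importance is measured by degree, that the index stores one entry per database node (label, degree, and the set of neighbor labels), and that a hypothetical tolerance `rho` bounds the fraction of a query node's neighbors that may be missing. All function and parameter names are invented for this example.

```python
from collections import defaultdict


def build_index(graph, labels):
    """Index each node under its label with a small neighborhood summary.

    One entry is stored per node, so the number of entries grows linearly
    with the number of nodes. (Illustrative layout, not the paper's.)
    """
    index = defaultdict(list)
    for node, nbrs in graph.items():
        nbr_labels = frozenset(labels[n] for n in nbrs)
        index[labels[node]].append((node, len(nbrs), nbr_labels))
    return index


def candidates(index, label, degree, nbr_labels, rho):
    """Return nodes with the same label that miss at most a fraction
    rho of the query node's neighbors (assumed tolerance rule)."""
    allowed_missing = int(rho * degree)
    result = []
    for node, deg, labs in index.get(label, []):
        missing_labels = len(nbr_labels - labs)
        if deg >= degree - allowed_missing and missing_labels <= allowed_missing:
            result.append(node)
    return result


def important_nodes(graph, fraction):
    """Pick the top `fraction` of nodes by degree (assumed importance measure)."""
    ranked = sorted(graph, key=lambda n: (-len(graph[n]), str(n)))
    return ranked[:max(1, int(len(ranked) * fraction))]


def approximate_match(query, q_labels, target, t_labels, index,
                      rho=0.25, fraction=0.3):
    """Match important query nodes first, then extend greedily to neighbors."""
    mapping, used = {}, set()

    # Step 1: anchor the important query nodes using the index.
    for q in important_nodes(query, fraction):
        nbr_labels = frozenset(q_labels[n] for n in query[q])
        for t in candidates(index, q_labels[q], len(query[q]), nbr_labels, rho):
            if t not in used:
                mapping[q] = t
                used.add(t)
                break

    # Step 2: extend each match to unmatched neighbors with equal labels.
    # Edges among the newly matched neighbors are not checked, which is
    # what lets a query edge be absent from the target.
    frontier = list(mapping)
    while frontier:
        q = frontier.pop()
        for qn in query[q]:
            if qn in mapping:
                continue
            for tn in target[mapping[q]]:
                if tn not in used and t_labels[tn] == q_labels[qn]:
                    mapping[qn] = tn
                    used.add(tn)
                    frontier.append(qn)
                    break
    return mapping
```

Graphs here are undirected adjacency dictionaries (`node -> list of neighbors`) with a separate `node -> label` dictionary. The greedy extension keeps the sketch short; it returns a single mapping and makes no claim about match quality or ranking.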


Citations
Book

Managing and Mining Graph Data

TL;DR: This is the first comprehensive survey book on the emerging topic of graph data processing, with extensive surveys of important graph topics such as graph languages, indexing, clustering, data generation, pattern mining, classification, keyword search, pattern matching, and privacy.
Proceedings Article

Large-scale malware indexing using function-call graphs

TL;DR: An efficient method to compute graph similarity that exploits structural and instruction-level information in the underlying malware programs, and a multi-resolution indexing scheme that uses a computationally economical feature vector for early pruning and resorts to a more accurate but computationally more expensive graph similarity function only when it needs to pinpoint the most similar neighbors.
Proceedings Article

Turboiso: towards ultrafast and robust subgraph isomorphism search in large graph databases

TL;DR: This paper presents an efficient and robust subgraph search solution, called TurboISO, which is turbo-charged with two novel concepts, candidate region exploration and the combine and permute strategy (in short, Comb/Perm).
Journal Article

An in-depth comparison of subgraph isomorphism algorithms in graph databases

TL;DR: Five state-of-the-art subgraph isomorphism algorithms are implemented in a common code base and compared on many real-world datasets and their query loads, with surprising empirical findings reported.
Proceedings Article

GADDI: distance index based subgraph matching in biological networks

TL;DR: A novel distance measurement is proposed that reintroduces the idea of frequent substructures in a single large graph of thousands of vertices, and a novel structure-distance-based approach (GADDI) is devised to efficiently find matches of the query graph.