Proceedings Article
BLINKS: ranked keyword searches on graphs
Hao He, Haixun Wang, Jun Yang, Philip S. Yu
- pp 305-316
TL;DR: BLINKS follows a search strategy with provable performance bounds, while additionally exploiting a bi-level index for pruning and accelerating the search, and offers orders-of-magnitude performance improvement over existing approaches.
Abstract:
Query processing over graph-structured data is enjoying a growing number of applications. A top-k keyword search query on a graph finds the top k answers according to some ranking criteria, where each answer is a substructure of the graph containing all query keywords. Current techniques for supporting such queries on general graphs suffer from several drawbacks, e.g., poor worst-case performance, not taking full advantage of indexes, and high memory requirements. To address these problems, we propose BLINKS, a bi-level indexing and query processing scheme for top-k keyword search on graphs. BLINKS follows a search strategy with provable performance bounds, while additionally exploiting a bi-level index for pruning and accelerating the search. To reduce the index space, BLINKS partitions a data graph into blocks: the bi-level index stores summary information at the block level to initiate and guide search among blocks, and more detailed information for each block to accelerate search within blocks. Our experiments show that BLINKS offers orders-of-magnitude performance improvement over existing approaches.
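To make the query model concrete, here is a minimal sketch of an unindexed top-k keyword search. It is not the BLINKS algorithm: it uses no bi-level index and no block partitioning. The abstract does not specify the ranking criteria, so the sketch assumes an answer is identified by a root node that can reach every query keyword, scored by the sum of shortest-path distances to the nearest node containing each keyword (lower is better). The function names, the edge-list format, and the sample graph are all hypothetical.

```python
import heapq
from collections import defaultdict


def keyword_distances(edges, node_keywords, keyword):
    """Distance from each node to the nearest node containing `keyword`.

    Runs a multi-source Dijkstra over reversed edges, i.e. it expands
    backward from the keyword nodes. Nodes that cannot reach the keyword
    are absent from the returned dict.
    """
    reverse = defaultdict(list)
    for src, dst, weight in edges:
        reverse[dst].append((src, weight))

    dist = {n: 0.0 for n, kws in node_keywords.items() if keyword in kws}
    heap = [(0.0, n) for n in dist]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for pred, weight in reverse[node]:
            nd = d + weight
            if nd < dist.get(pred, float("inf")):
                dist[pred] = nd
                heapq.heappush(heap, (nd, pred))
    return dist


def top_k_roots(edges, node_keywords, query, k):
    """Return up to k (score, root) pairs, best (lowest) score first."""
    if not query:
        return []
    per_keyword = [keyword_distances(edges, node_keywords, kw) for kw in query]
    # A root qualifies only if it can reach every query keyword.
    candidates = set(per_keyword[0])
    for dist in per_keyword[1:]:
        candidates &= set(dist)
    scored = [(sum(dist[r] for dist in per_keyword), r) for r in candidates]
    return heapq.nsmallest(k, scored)


# Hypothetical data graph: directed, weighted edges (source, target, weight).
edges = [("a", "b", 1), ("a", "c", 1), ("b", "d", 1), ("c", "d", 2), ("e", "a", 1)]
node_keywords = {
    "a": set(),
    "b": {"graph"},
    "c": {"index"},
    "d": {"graph", "index"},
    "e": set(),
}

print(top_k_roots(edges, node_keywords, ["graph", "index"], 3))
# [(0.0, 'd'), (1.0, 'b'), (2.0, 'a')]
```

This baseline recomputes every keyword's distances over the whole graph for each query. That per-query cost is the kind of work the abstract says BLINKS reduces with precomputed block-level and intra-block information.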
Citations
Book
Managing and Mining Graph Data
Charu C. Aggarwal, Haixun Wang
TL;DR: This is the first comprehensive survey book on the emerging topic of graph data processing; it contains extensive surveys on important graph topics such as graph languages, indexing, clustering, data generation, pattern mining, classification, keyword search, pattern matching, and privacy.
Proceedings Article
EASE: an effective 3-in-1 keyword search method for unstructured, semi-structured and structured data
TL;DR: An extended inverted index is proposed to facilitate keyword-based search, and a novel ranking mechanism for enhancing search effectiveness is presented, which achieves both high search efficiency and high accuracy.
Journal Article
Comparing stars: on approximating graph edit distance
TL;DR: Three novel methods are introduced to compute upper and lower bounds for the edit distance between two graphs in polynomial time, and the results show that these methods scale well in terms of both the number of graphs and the size of the graphs.
Journal Article
On graph query optimization in large networks
Peixiang Zhao, Jiawei Han
TL;DR: The experimental studies demonstrate the effectiveness and scalability of SPath, which proves to be a more practical and efficient indexing method in addressing graph queries on large networks.
Proceedings Article
Spark: top-k keyword query in relational databases
TL;DR: This paper proposes a new ranking formula by adapting existing IR techniques based on a natural notion of virtual document and proposes several efficient query processing methods for the new ranking method.
References
Journal Article
Some simplified NP-complete graph problems
TL;DR: This paper shows that a number of NP-complete problems remain NP-complete even when their domains are substantially restricted, and determines essentially the lowest possible upper bounds on node degree for which the problems remain NP-complete.
Proceedings Article
Keyword searching and browsing in databases using BANKS
TL;DR: BANKS is described, a system which enables keyword-based search on relational databases, together with data and schema browsing, and presents an efficient heuristic algorithm for finding and ranking query results.
Proceedings Article
Optimal aggregation algorithms for middleware
TL;DR: An elegant and remarkably simple algorithm is analyzed that is optimal in a much stronger sense than FA, and is essentially optimal, not just for some monotone aggregation functions, but for all of them, and not just in a high-probability sense, but over every database.
Book Chapter
Discover: keyword search in relational databases
TL;DR: It is proved that DISCOVER, by exploiting the structure of the schema, finds without redundancy all relevant candidate networks, whose size can be data bound, and that selecting the optimal execution plan (the way to reuse common subexpressions) is NP-complete.
Journal Article
Stuff I've Seen: A System for Personal Information Retrieval and Re-Use
Susan T. Dumais, Edward Cutrell, Jonathan J. Cadiz, Gavin Jancke, Raman K. Sarin, Daniel C. Robbins
TL;DR: This paper describes the design and evaluation of a system, called Stuff I've Seen (SIS), that facilitates information re-use and provides a unified index of information that a person has seen, whether as email, web page, document, appointment, etc.