Proceedings Article

EASE: an effective 3-in-1 keyword search method for unstructured, semi-structured and structured data

TLDR
The paper proposes an extended inverted index to facilitate keyword-based search and a novel ranking mechanism to enhance search effectiveness; experiments show that the resulting method, EASE, achieves both high search efficiency and high accuracy.
Abstract
Conventional keyword search engines are restricted to a given data model and cannot easily adapt to unstructured, semi-structured or structured data. In this paper, we propose an efficient and adaptive keyword search method, called EASE, for indexing and querying large collections of heterogeneous data. To achieve high efficiency in processing keyword queries, we first model unstructured, semi-structured and structured data as graphs, and then summarize the graphs and construct graph indices instead of using traditional inverted indices. We propose an extended inverted index to facilitate keyword-based search, and present a novel ranking mechanism for enhancing search effectiveness. We have conducted an extensive experimental study using real datasets, and the results show that EASE achieves both high search efficiency and high accuracy, and outperforms the existing approaches significantly.
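The general idea of treating heterogeneous data as one graph and answering keyword queries over it can be illustrated with a minimal Python sketch. This is not the EASE algorithm, its extended inverted index, or its ranking mechanism, none of which are specified in the abstract. The sketch assumes a plain keyword-to-node inverted index, a breadth-first search bounded by a hypothetical `radius` parameter, and a placeholder score (the sum of hop distances). All function names and the toy data are made up for illustration.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def build_inverted_index(node_text):
    """Map each lowercased term to the set of nodes whose text contains it."""
    index = defaultdict(set)
    for node, text in node_text.items():
        for term in text.lower().split():
            index[term].add(node)
    return index


def nodes_within(adj, source, radius):
    """Return {node: hop distance} for nodes at most `radius` hops from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not expand beyond the radius
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def keyword_search(adj, index, node_text, keywords, radius=1):
    """Find centers whose radius-bounded neighborhood covers every keyword.

    Returns (cost, center, {keyword: matching node}) tuples sorted by cost,
    where cost is the sum of hop distances (a placeholder score only).
    """
    keywords = [k.lower() for k in keywords]
    results = []
    for center in node_text:
        reach = nodes_within(adj, center, radius)
        matched = {}
        for k in keywords:
            hits = [n for n in index.get(k, ()) if n in reach]
            if not hits:
                break  # this neighborhood misses a keyword
            matched[k] = min(hits, key=lambda n: (reach[n], n))
        else:
            cost = sum(reach[n] for n in matched.values())
            results.append((cost, center, matched))
    results.sort(key=lambda r: (r[0], r[1]))
    return results


# Toy data: nodes could stand for tuples, XML elements, or documents.
node_text = {
    "t1": "database systems",
    "t2": "keyword search",
    "x1": "graph index",
    "d1": "ranking",
}
adj = build_graph([("t1", "t2"), ("t2", "x1"), ("x1", "d1")])
index = build_inverted_index(node_text)
answers = keyword_search(adj, index, node_text, ["keyword", "graph"], radius=1)
```

In this toy graph, the query for "keyword" and "graph" is satisfied only by neighborhoods centered on `t2` or `x1`, since those are the only nodes within one hop of a match for both terms. A larger radius admits looser answers at a higher cost.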


Citations
Proceedings Article

Spark: top-k keyword query in relational databases

TL;DR: This paper proposes a new ranking formula by adapting existing IR techniques based on a natural notion of virtual document and proposes several efficient query processing methods for the new ranking method.
Proceedings Article

A Survey on Deep Learning in Big Data

TL;DR: This paper surveys what Big Data is, compares methods, and reviews its research problems and trends; it also presents the application of Deep Learning to Big Data, along with its challenges, open research problems, and future trends.
Journal Article

Effective community search for large attributed graphs

TL;DR: The results show that ACs are more effective and efficient than existing community retrieval approaches, and that they contain more precise and personalized information than the results of existing community search and detection methods.
Journal Article

Attribute-driven community search

TL;DR: This paper develops an efficient greedy algorithmic framework that iteratively removes nodes with the least popular attributes and shrinks the graph into an ATC, builds an elegant index to maintain k-truss structure and attribute information, and proposes efficient query processing algorithms.
Journal Article

Keyword Search in Relational Databases: A Survey.

TL;DR: This work surveys the developments on finding structural information among tuples in an RDB using an l-keyword query, Q, which is a set of keywords of size l, denoted as Q = {k1, k2, ..., kl}.
References
Proceedings Article

Keyword searching and browsing in databases using BANKS

TL;DR: This paper describes BANKS, a system that enables keyword-based search on relational databases together with data and schema browsing, and presents an efficient heuristic algorithm for finding and ranking query results.
Book Chapter

Discover: keyword search in relational databases

TL;DR: It is proved that DISCOVER finds, without redundancy, all relevant candidate networks, whose size can be data bound, by exploiting the structure of the schema, and that selecting the optimal execution plan (the way to reuse common subexpressions) is NP-complete.
Proceedings Article

XRANK: ranked keyword search over XML documents

TL;DR: The XRANK system is presented, designed to handle the novel features of XML keyword search; it naturally generalizes a hyperlink-based HTML search engine such as Google and can be used to query a mix of HTML and XML documents.
Proceedings Article

DBXplorer: a system for keyword-based search over relational databases

TL;DR: This paper discusses DBXplorer, a system that enables keyword-based search in relational databases using a commercial relational database and Web server, and allows users to interact via a browser front-end.
Proceedings Article

Combining Fuzzy Information from Multiple Systems.

TL;DR: An algorithm is given, which has been implemented in Garlic, such that if the conjuncts are independent, then with arbitrarily high probability, the total number of elements retrieved in evaluating the query is sublinear in the database size.