XRANK: ranked keyword search over XML documents

doi:10.1145/872757.872762

Proceedings Article, pp. 16-27

TL;DR

The paper presents XRANK, a system designed to handle the novel features of XML keyword search. XRANK naturally generalizes a hyperlink-based HTML search engine such as Google and can be used to query a mix of HTML and XML documents.

Abstract:

We consider the problem of efficiently producing ranked results for keyword search queries over hyperlinked XML documents. Evaluating keyword search queries over hierarchical XML documents, as opposed to (conceptually) flat HTML documents, introduces many new challenges. First, XML keyword search queries do not always return entire documents, but can return deeply nested XML elements that contain the desired keywords. Second, the nested structure of XML implies that the notion of ranking is no longer at the granularity of a document, but at the granularity of an XML element. Finally, the notion of keyword proximity is more complex in the hierarchical XML data model. In this paper, we present the XRANK system that is designed to handle these novel features of XML keyword search. Our experimental results show that XRANK offers both space and performance benefits when compared with existing approaches. An interesting feature of XRANK is that it naturally generalizes a hyperlink based HTML search engine such as Google. XRANK can thus be used to query a mix of HTML and XML documents.
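To make the first challenge in the abstract concrete (results are nested elements rather than whole documents), here is a minimal Python sketch. It is not the XRANK algorithm and uses none of its index structures or ranking; it only assumes a simplified result definition, "an element whose subtree contains every keyword while no child element does", and the names `element_words`, `most_specific_matches`, and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up sample document, used only for illustration.
DOC = """
<workshop>
  <title>XML and IR</title>
  <paper>
    <title>Proximity in trees</title>
    <section>
      <heading>Ranked search</heading>
      <para>Keyword search over XML elements</para>
    </section>
  </paper>
  <paper>
    <title>Querying XML</title>
  </paper>
</workshop>
"""

def element_words(elem):
    """Lower-cased words in the text of elem and all of its descendants."""
    return set(" ".join(elem.itertext()).lower().split())

def most_specific_matches(root, keywords):
    """Elements that contain every keyword while none of their children do.

    Simplifying assumption: this naive traversal rescans subtrees and
    ignores ranking, hyperlinks, and keyword proximity.
    """
    wanted = {k.lower() for k in keywords}
    results = []

    def visit(elem):
        # Prune subtrees that cannot contain all keywords.
        if not wanted <= element_words(elem):
            return False
        # Visit every child so that all deepest matches are collected.
        child_hits = [visit(child) for child in elem]
        if not any(child_hits):
            results.append(elem)
        return True

    visit(root)
    return results

root = ET.fromstring(DOC)
print([e.tag for e in most_specific_matches(root, ["keyword", "xml"])])     # ['para']
print([e.tag for e in most_specific_matches(root, ["ranked", "keyword"])])  # ['section']
```

In this toy document the first query is answered by a single paragraph and the second by its enclosing section, while the root `workshop` element is never returned even though it also contains every keyword.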

Citations

Journal Article

Efficient query evaluation on probabilistic databases

Nilesh Dalvi, +1 more

TL;DR: It is shown that the data complexity of some queries is #P-complete, which implies that these queries do not admit any efficient evaluation methods, and an optimization algorithm is described that can efficiently compute most queries.

Journal Article

YAGO: A Large Ontology from Wikipedia and WordNet

Fabian M. Suchanek, +2 more

01 Sep 2008, Journal of Web Semantics

TL;DR: YAGO is a large ontology with high coverage and precision, based on a clean logical model with a decidable consistency that allows representing n-ary relations in a natural way while maintaining compatibility with RDFS.

Journal Article

A survey of top-k query processing techniques in relational database systems

Ihab F. Ilyas, +2 more

15 Oct 2008, ACM Computing Surveys

TL;DR: This survey describes and classifies top-k processing techniques in relational databases along several dimensions, including query models, data access methods, implementation levels, data and query certainty, and supported scoring functions, and shows the implications of each dimension on the design of the underlying techniques.

Book Chapter

XSEarch: a semantic search engine for XML

Sara Cohen, +3 more

TL;DR: These experiments indicate that XSEarch is efficient, scalable and ranks quality results highly.

Proceedings Article

BLINKS: ranked keyword searches on graphs

Hao He, +3 more

TL;DR: BLINKS follows a search strategy with provable performance bounds, while additionally exploiting a bi-level index for pruning and accelerating the search, and offers orders-of-magnitude performance improvement over existing approaches.

Related Papers (5)

XSEarch: a semantic search engine for XML

Efficient keyword search for smallest LCAs in XML databases

Keyword searching and browsing in databases using BANKS

Discover: keyword search in relational databases

DBXplorer: a system for keyword-based search over relational databases