Top 2 papers published by Eugene J. Shekita from IBM in 2008

Proceedings Article

[...]

Ori Ben-Yitzhak¹, Nadav Golbandi¹, Nadav Har'El¹, Ronny Lempel², Andreas Neumann¹, Shila Ofek-Koifman¹, Dafna Sheinwald¹, Eugene J. Shekita¹, Benjamin Sznajder¹, Sivan Yogev¹

IBM¹, Yahoo!²

11 Feb 2008

TL;DR: This paper extends traditional faceted search to support richer information discovery tasks over more complex data models, and adds flexible, dynamic business intelligence aggregations to the faceted application, enabling users to gain insight into their data far richer than just knowing the quantities of documents belonging to each facet.


Abstract: This paper extends traditional faceted search to support richer information discovery tasks over more complex data models. Our first extension adds flexible, dynamic business intelligence aggregations to the faceted application, enabling users to gain insight into their data that is far richer than just knowing the quantities of documents belonging to each facet. We see this capability as a step toward bringing OLAP capabilities, traditionally supported by databases over relational data, to the domain of free-text queries over metadata-rich content. Our second extension shows how one can efficiently extend a faceted search engine to support correlated facets - a more complex information model in which the values associated with a document across multiple facets are not independent. We show that by reducing the problem to a recently solved tree-indexing scenario, data with correlated facets can be efficiently indexed and retrieved.


174 citations
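
To make the first extension in the abstract above concrete, here is a minimal sketch of a per-facet aggregation that goes beyond document counts. It is an illustration only, not the paper's implementation: the toy corpus `DOCS`, the `price` field, and the functions `search` and `facet_aggregate` are all hypothetical names chosen for this example.

```python
from collections import defaultdict

# Hypothetical toy corpus: free text, facet values, and one numeric field.
DOCS = [
    {"id": 1, "text": "red running shoes",
     "facets": {"brand": "Acme", "color": "red"}, "price": 50.0},
    {"id": 2, "text": "blue running shoes",
     "facets": {"brand": "Acme", "color": "blue"}, "price": 70.0},
    {"id": 3, "text": "red hiking boots",
     "facets": {"brand": "Zenith", "color": "red"}, "price": 120.0},
]


def search(query):
    """Return documents whose text contains every query term (naive scan)."""
    terms = query.lower().split()
    return [d for d in DOCS if all(t in d["text"].split() for t in terms)]


def facet_aggregate(docs, facet, field):
    """Group matching documents by a facet value and aggregate a numeric field.

    Plain faceted search would report only "count"; "sum" and "avg" stand in
    for the richer, OLAP-style aggregations the abstract refers to.
    """
    buckets = defaultdict(list)
    for d in docs:
        value = d["facets"].get(facet)
        if value is not None:
            buckets[value].append(d[field])
    return {
        value: {"count": len(xs), "sum": sum(xs), "avg": sum(xs) / len(xs)}
        for value, xs in buckets.items()
    }


if __name__ == "__main__":
    print(facet_aggregate(search("running"), "brand", "price"))
```

A real engine would compute these aggregates from index structures rather than by scanning documents; the scan here only keeps the example short.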

Proceedings Article

Supporting sub-document updates and queries in an inverted index

[...]

Vuk Ercegovac¹, Vanja Josifovski², Ning Li¹, Mauricio Mediano², Eugene J. Shekita¹

IBM¹, Yahoo!²

26 Oct 2008

TL;DR: A novel self-optimizing query execution algorithm is described to efficiently join the sections of a document in the inverted index, showing that sections can dramatically improve overall system throughput on a mixed workload of updates and queries.


Abstract: Inverted indexes have become the standard indexing method for supporting search queries in a variety of content-based applications. Examples of such applications include enterprise document management, e-mail, web search, and social networks. One shortcoming in current inverted index designs is that they support only document-level updates, forcing a full document to be reindexed even if just part of it changes. This paper describes a new inverted index design that enables applications to break a document into semantically meaningful sub-documents or "sections". Each section of a document can be updated separately, but search queries can still work seamlessly across sections. Our index design is motivated by applications where there is metadata associated with each document that tends to be smaller and more frequently updated than the document's content, but at the same time, it is desirable to search the metadata and content with the same index structure. A novel self-optimizing query execution algorithm is described to efficiently join the sections of a document in the inverted index. Experimental results on TREC and patent data are provided, showing that sections can dramatically improve overall system throughput on a mixed workload of updates and queries.


53 citations
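
The section idea in the abstract above can be illustrated with a small in-memory sketch: postings are keyed by (document, section), so updating one section touches only that section's postings, while a query still matches across all sections of a document. This is an assumption-laden toy, not the paper's design: the class `SectionedIndex` and its methods are hypothetical, and the query step is a naive set intersection on document IDs, not the self-optimizing join algorithm the paper describes.

```python
from collections import defaultdict


class SectionedIndex:
    """Toy inverted index whose postings are (doc_id, section) pairs."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> {(doc_id, section)}
        self.section_terms = {}           # (doc_id, section) -> set of terms

    def update_section(self, doc_id, section, text):
        """Reindex a single section, leaving the document's other sections alone."""
        key = (doc_id, section)
        # Drop only the postings that belonged to the old version of this section.
        for term in self.section_terms.get(key, ()):
            self.postings[term].discard(key)
        terms = set(text.lower().split())
        for term in terms:
            self.postings[term].add(key)
        self.section_terms[key] = terms

    def search(self, query):
        """Return sorted doc IDs where every term occurs in some section of the doc."""
        result = None
        for term in query.lower().split():
            docs = {doc_id for doc_id, _ in self.postings.get(term, ())}
            result = docs if result is None else result & docs
        return sorted(result or [])


if __name__ == "__main__":
    idx = SectionedIndex()
    idx.update_section(1, "content", "inverted index design")
    idx.update_section(1, "meta", "draft")
    print(idx.search("index draft"))    # matches across content and metadata
    idx.update_section(1, "meta", "final")  # metadata-only update
    print(idx.search("index final"))
```

The point of the sketch is the update path: changing the small, frequently updated metadata section does not require reindexing the larger content section.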
