Top 2 papers published by Eugene J. Shekita from IBM in 2007

Patent

[...]

Marcus Fontoura¹, Andreas Neumann¹, Sridhar Rajagopalan¹, Eugene J. Shekita¹, Jason Zien¹

IBM¹

06 Aug 2007

TL;DR: This patent describes an indexing technique in which a sort key is generated for each token. The sort key includes a document identifier that indicates whether the section of the document associated with the sort key is an anchor text section or a context section, where the anchor text section and the context section of a document share the same document identifier.


Abstract: Disclosed is a technique for indexing data. For each token in a set of documents, a sort key is generated that includes a document identifier that indicates whether a section of a document associated with the sort key is an anchor text section or a context section, wherein the anchor text section and the context text section have a same document identifier; it is determined whether a data field associated with the token is a fixed width; when the data field is a fixed width, the token is designated as one for which fixed width sort is to be performed; and, when the data field is a variable length, the token is designated as one for which a variable width sort is to be performed. The fixed width sort and the variable width sort are performed. For each document, the sort keys are used to bring together the anchor text section and the context section of that document.
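The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Posting` record, the `ANCHOR`/`CONTEXT` flag values, the 4-byte width, and the choice to carry the section flag as a separate key component next to the document identifier are all assumptions made for illustration, and Python's built-in `sorted` stands in for both the fixed width and the variable width sort.

```python
from dataclasses import dataclass
from itertools import groupby

# Hypothetical flag values; the abstract does not say how the
# anchor/context indication is encoded in the sort key.
ANCHOR, CONTEXT = 0, 1


@dataclass(frozen=True)
class Posting:
    token: str
    doc_id: int    # shared by the anchor and context sections of a document
    section: int   # ANCHOR or CONTEXT
    data: bytes    # data field associated with the token


def sort_key(p):
    # Sorting on (doc_id, section) places a document's anchor text
    # postings directly next to its context postings.
    return (p.doc_id, p.section)


def designate(postings, width=4):
    # Assumption: a token counts as "fixed" when every data field
    # attached to it has the same, known width.
    return "fixed" if all(len(p.data) == width for p in postings) else "variable"


def build_index(postings, width=4):
    by_token = {}
    for p in postings:
        by_token.setdefault(p.token, []).append(p)
    index = {}
    for token, plist in by_token.items():
        kind = designate(plist, width)
        # A real system would dispatch to a different sort routine per
        # kind; here sorted() is used for both.
        index[token] = (kind, sorted(plist, key=sort_key))
    return index


def sections_by_document(sorted_postings):
    # Bring together the sections of each document after sorting.
    return {
        doc_id: [p.section for p in group]
        for doc_id, group in groupby(sorted_postings, key=lambda p: p.doc_id)
    }


postings = [
    Posting("ibm", 2, CONTEXT, b"\x00\x00\x00\x01"),
    Posting("ibm", 1, CONTEXT, b"\x00\x00\x00\x03"),
    Posting("ibm", 1, ANCHOR, b"\x00\x00\x00\x02"),
    Posting("search", 1, CONTEXT, b"ab"),
    Posting("search", 1, ANCHOR, b"abcdef"),
]
index = build_index(postings)
```

In this sketch the token `ibm` is designated for the fixed width sort because all of its data fields are 4 bytes, while `search` falls back to the variable width sort. Because both section types carry the same `doc_id`, a single sort is enough to make them adjacent.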


66 citations

Proceedings Article

Impliance: A Next Generation Information Management Appliance

[...]

Bishwaranjan Bhattacharjee, Vuk Ercegovac, Joseph S. Glider, Richard A. Golding, Guy M. Lohman, Volker Markl, Hamid Pirahesh, Jun Rao, Robert M. Rees, Frederick Reiss, Eugene J. Shekita, Garret Swart¹

IBM¹

01 Jan 2007

TL;DR: Impliance is a next-generation information management system consisting of hardware and software components integrated to form an easy-to-administer appliance that can store, retrieve, and analyze all types of structured, semi-structured, and unstructured information.


Abstract: Though database technology has been remarkably successful in building a large market and adapting to the changes of the last three decades, its impact on the broader market of information management is surprisingly limited. If we were to design an information management system from scratch, based upon today’s requirements and hardware capabilities, would it look anything like today’s database systems? In this paper, we introduce Impliance, a next-generation information management system consisting of hardware and software components integrated to form an easy-to-administer appliance that can store, retrieve, and analyze all types of structured, semi-structured, and unstructured information. We first summarize the trends that will shape information management for the foreseeable future. Those trends imply three major requirements for Impliance: (1) to be able to store, manage, and uniformly query and transform all data, not just structured records; (2) to be able to scale out as the volume of this data grows; and (3) to be simple and robust in operation. We then describe four key ideas that are uniquely combined in Impliance to address these requirements, namely the ideas of: (a) integrating software and off-the-shelf hardware into a generic information appliance; (b) automatically discovering, organizing, and managing all data – unstructured as well as structured – in a uniform way; (c) achieving scale-out by exploiting simple, massive parallel processing, and (d) virtualizing compute and storage resources to unify, simplify, and streamline the management of Impliance. Impliance is an ambitious, long-term effort to define simpler, more robust, and more scalable information systems for tomorrow’s enterprises.


6 citations
