Showing papers by "Anthony Tomasic" published in 1996

Proceedings Article

Scaling heterogeneous databases and the design of Disco

Anthony Tomasic, Louiqa Raschid, Patrick Valduriez

27 May 1996

TL;DR: The Distributed Information Search COmponent (Disco) uses a distributed mediator architecture for heterogeneous distributed databases; it processes queries over data sources that do not return answers and translates queries between query languages and schemas.

Abstract: Access to large numbers of data sources introduces new problems for users of heterogeneous distributed databases. End users and application programmers must deal with unavailable data sources. Database administrators must deal with incorporating new sources into the model. Database implementers must deal with the translation of queries between query languages and schemas. The Distributed Information Search COmponent (Disco) addresses these problems. Query processing semantics are developed to process queries over data sources which do not return answers. Data modeling techniques manage connections to data sources. The component interface to data sources flexibly handles different query languages and translates queries. This paper describes (a) the distributed mediator architecture of Disco, (b) its query processing semantics, (c) the data model and its modeling of data source connections, and (d) the interface to underlying data sources.

276 citations

Proceedings Article

Scrambling query plans to cope with unexpected delays

Laurent Amsaleg¹, Anthony Tomasic², Michael J. Franklin¹, Tolga Urhan¹

University of Maryland, College Park¹, French Institute for Research in Computer Science and Automation²

01 Dec 1996

TL;DR: An algorithm is presented that modifies execution plans on-the-fly in response to unexpected delays in obtaining initial requested tuples from remote sources using a class of dynamic, run time query plan modification techniques that are called query plan scrambling.

Abstract: Accessing data from numerous widely distributed sources poses significant new challenges for query optimization and execution. Congestion and failures in the network can introduce highly variable response times for wide area data access. The paper is an initial exploration of solutions to this variability. We introduce a class of dynamic, run time query plan modification techniques that we call query plan scrambling. We present an algorithm that modifies execution plans on-the-fly in response to unexpected delays in obtaining initial requested tuples from remote sources. The algorithm both reschedules operators and introduces new operators into the query plan. We present simulation results that demonstrate how the technique effectively hides delays by performing other useful work while waiting for missing data to arrive.

118 citations

Scaling Heterogeneous Distributed Databases and the Design of DISCO

Anthony Tomasic, Louiqa Raschid, Patrick Valduriez

01 Jan 1996

16 citations

Journal Article

Performance issues in distributed shared-nothing information-retrieval systems

Anthony Tomasic¹, Hector Garcia-Molina¹

Stanford University¹

01 Nov 1996-Information Processing and Management

TL;DR: This article studies a physical-index design that accommodates truncations, inverted-index caching, and database scaling in a distributed shared-nothing system, and shows that all three issues have a strong effect on response time and throughput.

Abstract: Many information-retrieval systems provide access to abstracts. For example, Stanford University, through its FOLIO system, provides access to the INSPEC database of abstracts of the literature on physics, computer science, electrical engineering, etc. In this article, this database is studied by using a trace-driven simulation. It focuses on a physical-index design that accommodates truncations, inverted-index caching, and database scaling in a distributed shared-nothing system. All three issues are shown to have a strong effect on response time and throughput. Database scaling is explored in two ways. One way assumes an “optimal” configuration for a single host and then linearly scales the database by duplicating the host architecture as needed. The second way determines the optimal number of hosts given a fixed database size.

13 citations