Showing papers by "Anthony Tomasic" published in 1996


Proceedings Article
27 May 1996
TL;DR: The Distributed Information Search COmponent (Disco) is a distributed mediator architecture for heterogeneous distributed databases that handles unavailable data sources, the incorporation of new sources, and the translation of queries between query languages and schemas.
Abstract: Access to large numbers of data sources introduces new problems for users of heterogeneous distributed databases. End users and application programmers must deal with unavailable data sources. Database administrators must deal with incorporating new sources into the model. Database implementers must deal with the translation of queries between query languages and schemas. The Distributed Information Search COmponent (Disco) addresses these problems. Query processing semantics are developed to process queries over data sources which do not return answers. Data modeling techniques manage connections to data sources. The component interface to data sources flexibly handles different query languages and translates queries. This paper describes (a) the distributed mediator architecture of Disco, (b) its query processing semantics, (c) the data model and its modeling of data source connections, and (d) the interface to underlying data sources.

276 citations


Proceedings Article
01 Dec 1996
TL;DR: This paper introduces query plan scrambling, a class of dynamic, run-time query plan modification techniques, and presents an algorithm that modifies execution plans on the fly in response to unexpected delays in obtaining the initial requested tuples from remote sources.
Abstract: Accessing data from numerous widely distributed sources poses significant new challenges for query optimization and execution. Congestion and failures in the network can introduce highly variable response times for wide area data access. The paper is an initial exploration of solutions to this variability. We introduce a class of dynamic, run time query plan modification techniques that we call query plan scrambling. We present an algorithm that modifies execution plans on-the-fly in response to unexpected delays in obtaining initial requested tuples from remote sources. The algorithm both reschedules operators and introduces new operators into the query plan. We present simulation results that demonstrate how the technique effectively hides delays by performing other useful work while waiting for missing data to arrive.

118 citations
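
The rescheduling half of the query plan scrambling idea described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a flat list of independent scan operators, ignores dependencies between operators, and does not introduce new operators into the plan. The names `Operator`, `reschedule`, and the site labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    name: str    # hypothetical operator label
    source: str  # hypothetical remote source the operator reads from


def reschedule(plan, delayed_sources):
    """Run operators with responsive sources first and defer the rest.

    Relative order is preserved inside each group. Assumption: operators
    are independent, which a real query plan does not guarantee.
    """
    ready = [op for op in plan if op.source not in delayed_sources]
    blocked = [op for op in plan if op.source in delayed_sources]
    return ready + blocked


plan = [
    Operator("scan_A", "site1"),
    Operator("scan_B", "site2"),
    Operator("scan_C", "site3"),
]
# site1 is slow to deliver its first tuples, so scan_A is pushed back
# and the other scans do useful work in the meantime.
order = [op.name for op in reschedule(plan, {"site1"})]
print(order)
```

The sketch only shows how other work can be moved ahead of a delayed source; the algorithm in the paper also adds new operators to the plan.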



Journal Article
TL;DR: This article focuses on a physical-index design that accommodates truncations, inverted-index caching, and database scaling in a distributed shared-nothing system; all three are shown to have a strong effect on response time and throughput.
Abstract: Many information-retrieval systems provide access to abstracts. For example, Stanford University, through its FOLIO system, provides access to the INSPEC database of abstracts of the literature on physics, computer science, electrical engineering, etc. In this article, this database is studied by using a trace-driven simulation. It focuses on a physical-index design that accommodates truncations, inverted-index caching, and database scaling in a distributed shared-nothing system. All three issues are shown to have a strong effect on response time and throughput. Database scaling is explored in two ways. One way assumes an “optimal” configuration for a single host and then linearly scales the database by duplicating the host architecture as needed. The second way determines the optimal number of hosts given a fixed database size.

13 citations
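
As a rough illustration of why truncations matter for index design, the sketch below resolves a truncation (prefix) query against a small in-memory inverted index. It assumes "truncation" means matching every term that begins with a given prefix; the sorted-vocabulary layout, the function name `truncation_lookup`, and the sample data are assumptions for illustration, not the design studied in the article.

```python
from bisect import bisect_left


def truncation_lookup(vocab, postings, prefix):
    """Return sorted document ids for all terms that start with `prefix`.

    `vocab` must be a sorted list of terms; `postings` maps each term to
    a set of document ids. One query can touch many inverted lists.
    """
    lo = bisect_left(vocab, prefix)
    # "\uffff" sorts after the characters used in the sample terms, so
    # this bound lands just past the last term sharing the prefix.
    hi = bisect_left(vocab, prefix + "\uffff")
    result = set()
    for term in vocab[lo:hi]:
        result |= postings[term]
    return sorted(result)


# Hypothetical sample data.
vocab = ["compute", "computer", "computing", "database"]
postings = {
    "compute": {1},
    "computer": {2, 3},
    "computing": {3},
    "database": {4},
}
print(truncation_lookup(vocab, postings, "comput"))  # [1, 2, 3]
```

A single truncated term expands into several inverted-list reads, which is one way such queries can affect response time.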