Open Access Proceedings Article
Optimizing Queries Across Diverse Data Sources
Laura M. Haas, Donald Kossmann, Edward L. Wimmers, Jun Yang
- pp 276-285
TLDR
This work presents the design and implementation of a query optimizer for Garlic, a middleware system designed to integrate data from a broad range of data sources with very different query capabilities.
Abstract:
Businesses today need to interrelate data stored in diverse systems with differing capabilities, ideally via a single high-level query interface. We present the design of a query optimizer for Garlic [C 95], a middleware system designed to integrate data from a broad range of data sources with very different query capabilities. Garlic’s optimizer extends the rule-based approach of [Loh88] to work in a heterogeneous environment, by defining generic rules for the middleware and using wrapper-provided rules to encapsulate the capabilities of each data source. This approach offers great advantages in terms of plan quality, extensibility to new sources, incremental implementation of rules for new sources, and the ability to express the capabilities of a diverse set of sources. We describe the design and implementation of this optimizer, and illustrate its actions through an example.
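The split between generic middleware rules and wrapper-provided rules can be illustrated with a small sketch. This is not Garlic's actual interface: the `Plan` type, the two wrapper functions, the source names, and every cost number below are hypothetical, chosen only to show how a wrapper might report what its source can evaluate while a generic rule compensates in the middleware for whatever is left over.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Plan:
    source: str                       # hypothetical source name
    pushed_down: FrozenSet[str]       # predicates the source evaluates itself
    in_middleware: FrozenSet[str]     # predicates applied by the middleware
    cost: float                       # made-up cost units


# A wrapper rule takes the requested predicates and returns candidate plans
# that describe only what its own source is able to do.
WrapperRule = Callable[[FrozenSet[str]], List[Plan]]

MIDDLEWARE_FILTER_COST = 5.0  # assumed cost per predicate filtered in the middleware


def relational_wrapper(preds: FrozenSet[str]) -> List[Plan]:
    # Hypothetical capable source: evaluates every predicate itself.
    return [Plan("rel_db", preds, frozenset(), cost=10.0)]


def file_wrapper(preds: FrozenSet[str]) -> List[Plan]:
    # Hypothetical limited source: can only scan, so it pushes nothing down.
    return [Plan("flat_file", frozenset(), frozenset(), cost=50.0)]


def compensate(plan: Plan, missing: FrozenSet[str]) -> Plan:
    # Generic middleware rule: apply any predicate the source did not handle.
    extra = MIDDLEWARE_FILTER_COST * len(missing)
    return replace(plan, in_middleware=missing, cost=plan.cost + extra)


def plan_access(source: str,
                wrapper_rules: Dict[str, WrapperRule],
                preds: Iterable[str]) -> Plan:
    # Ask the wrapper for candidates, compensate each one, keep the cheapest.
    wanted = frozenset(preds)
    candidates = [
        compensate(plan, wanted - plan.pushed_down)
        for plan in wrapper_rules[source](wanted)
    ]
    return min(candidates, key=lambda p: p.cost)


rules: Dict[str, WrapperRule] = {
    "rel_db": relational_wrapper,
    "flat_file": file_wrapper,
}
rel_plan = plan_access("rel_db", rules, {"price < 100"})
file_plan = plan_access("flat_file", rules, {"price < 100"})
```

In this sketch, adding a new source means registering one more wrapper function; the generic `compensate` and `plan_access` logic stays unchanged, which mirrors the extensibility argument made in the abstract.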
Citations
Journal Article
The state of the art in distributed query processing
TL;DR: The paper presents the “textbook” architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems, and discusses different kinds of distributed systems such as client-server, middleware (multitier), and heterogeneous database systems and shows how query processing works in these systems.
Journal Article
Eddies: continuously adaptive query processing
Ron Avnur, Joseph M. Hellerstein
TL;DR: This paper introduces a query processing mechanism called an eddy, which continuously reorders operators in a query plan as it runs, and describes the moments of symmetry during which pipelined joins can be easily reordered, and the synchronization barriers that require inputs from different sources to be coordinated.
Proceedings Article
Reconciling schemas of disparate data sources: a machine-learning approach
TL;DR: LSD is a system that employs and extends current machine-learning techniques to semi-automatically find semantic mappings between the source schemas and the mediated schema, and its architecture is extensible to additional learners that may exploit new kinds of information.
Proceedings Article
Extracting structured data from Web pages
TL;DR: This paper presents an algorithm that takes, as input, a set of template-generated pages, deduces the unknown template used to generate the pages, and extracts, as output, the values encoded in the pages.
Proceedings Article
CrowdDB: answering queries with crowdsourcing
TL;DR: The design of CrowdDB is described; a major change is that the traditional closed-world assumption for query processing does not hold for human input. Important avenues for future work in the development of crowdsourced query processing systems are outlined.
References
Proceedings Article
QBIC project: querying images by content, using color, texture, and shape
Carlton Wayne Niblack, R. Barber, Will Equitz, Myron D. Flickner, Eduardo H. Glasman, Dragutin Petkovic, Peter Cornelius Yanker, Christos Faloutsos, Gabriel Taubin
TL;DR: The main algorithms for color, texture, shape, and sketch query are presented, along with example query results and a discussion of future directions.
Proceedings Article
Access path selection in a relational database management system
TL;DR: System R is an experimental database management system developed to carry out research on the relational model of data; it chooses access paths for both simple (single-relation) and complex queries (such as joins), given a user specification of desired data as a boolean expression of predicates.
Proceedings Article
Querying Heterogeneous Information Sources Using Source Descriptions
TL;DR: The Information Manifold is described, an implemented system that provides uniform access to a heterogeneous collection of more than 100 information sources, many of them on the WWW, together with algorithms that use the source descriptions to efficiently prune the set of information sources for a given query.
Book
The object database standard: ODMG 2.0
R. G. G. Cattell, Douglas K. Barry, Dirk Bartels, Mark Berler, Jeff Eastman, Sophie Gamerman, David Jordan, Adam Springer, Henry Strickland, Drew Wade
TL;DR: This book defines standards for object management systems and is intended as the foundational book for object-oriented database products.
Proceedings Article
Object exchange across heterogeneous information sources
TL;DR: An object-based information exchange model and a corresponding query language are defined that are well suited for integration of diverse information sources and used to integrate heterogeneous bibliographic information sources.