Open Access Proceedings Article

DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases

Roy Goldman, +1 more
- pp 436-445
TLDR
The theoretical foundations of DataGuides are presented, along with an algorithm for their creation and an overview of incremental maintenance. Performance results based on the implementation of DataGuides in the Lore DBMS for semistructured data are also provided.
Abstract
In semistructured databases there is no schema fixed in advance. To provide the benefits of a schema in such environments, we introduce DataGuides: concise and accurate structural summaries of semistructured databases. DataGuides serve as dynamic schemas, generated from the database; they are useful for browsing database structure, formulating queries, storing information such as statistics and sample values, and enabling query optimization. This paper presents the theoretical foundations of DataGuides along with an algorithm for their creation and an overview of incremental maintenance. We provide performance results based on our implementation of DataGuides in the Lore DBMS for semistructured data. We also describe the use of DataGuides in Lore, both in the user interface to enable structure browsing and query formulation, and as a means of guiding the query processor and optimizing query execution.
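
The abstract does not spell out the creation algorithm, so the following is only a minimal sketch of one way such a structural summary could be built. It assumes the database is a rooted graph with labeled edges and that the summary is obtained by a subset construction similar to determinizing a finite automaton, so that every distinct label path in the data appears exactly once in the summary. The names `build_dataguide`, `label_paths`, and the toy `db` graph are hypothetical and are not taken from the paper or from Lore.

```python
from collections import deque


def build_dataguide(edges, root):
    """Build a path summary of a labeled graph by subset construction.

    edges: dict mapping an object id to a list of (label, child id) pairs.
    Returns (guide, start), where each guide node is a frozenset of source
    objects (its "target set") and guide[node] maps a label to the next node.
    """
    start = frozenset([root])
    guide = {}
    queue = deque([start])
    while queue:
        targets = queue.popleft()
        if targets in guide:
            continue  # this target set has already been expanded
        by_label = {}
        for obj in targets:
            for label, child in edges.get(obj, []):
                by_label.setdefault(label, set()).add(child)
        guide[targets] = {lbl: frozenset(kids) for lbl, kids in by_label.items()}
        for nxt in guide[targets].values():
            if nxt not in guide:
                queue.append(nxt)
    return guide, start


def label_paths(guide, start, max_depth):
    """List the label paths of length 1..max_depth recorded in the summary."""
    paths = []
    stack = [((), start)]
    while stack:
        path, node = stack.pop()
        if path:
            paths.append(path)
        if len(path) < max_depth:
            for label, nxt in guide[node].items():
                stack.append((path + (label,), nxt))
    return sorted(paths)


# Toy database (hypothetical): two restaurants with different substructure.
db = {
    "root": [("restaurant", "r1"), ("restaurant", "r2")],
    "r1": [("name", "n1"), ("phone", "p1")],
    "r2": [("name", "n2"), ("owner", "o1")],
    "o1": [("name", "n3")],
}

guide, start = build_dataguide(db, "root")
paths = label_paths(guide, start, max_depth=3)
for p in paths:
    print(".".join(p))
```

In this sketch both `restaurant` objects collapse into a single summary node, which is the sense in which the summary can be browsed like a schema. Each summary node also keeps the set of source objects it stands for, which is where statistics or sample values could be attached.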


Citations
Journal ArticleDOI

A survey of approaches to automatic schema matching

TL;DR: A taxonomy is presented that distinguishes between schema-level and instance-level, element-level and structure-level, and language-based and constraint-based matchers. It is intended to be useful when comparing different approaches to schema matching, when developing a new match algorithm, and when implementing a schema matching component.
Journal ArticleDOI

Web mining research: a survey

TL;DR: This paper surveys the research in the area of Web mining, points out some confusion regarding the usage of the term Web mining, and suggests three Web mining categories, then situates some of the research with respect to these three categories.
Proceedings Article

Indexing and Querying XML Data for Regular Path Expressions

Quanzhong Li, +1 more
TL;DR: A new system for indexing and storing XML data is proposed, based on a numbering scheme for elements that quickly determines the ancestor-descendant relationship between elements in the hierarchy of XML data.
Proceedings ArticleDOI

Semistructured data

TL;DR: A number of issues surrounding semistructured data are covered: finding a concise formulation, building a sufficiently expressive language for querying and transformation, and optimization problems.
Proceedings ArticleDOI

Graph indexing: a frequent structure-based approach

TL;DR: The gIndex approach not only provides an elegant solution to the graph indexing problem, but also demonstrates how database indexing and query processing can benefit from data mining, especially frequent pattern mining.