Open Access Proceedings Article
DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases
Roy Goldman, Jennifer Widom
- pp 436-445
TLDR
The theoretical foundations of DataGuides are presented along with an algorithm for their creation and an overview of incremental maintenance, and performance results based on the implementation of DataGuides in the Lore DBMS for semistructured data are provided.
Abstract:
In semistructured databases there is no schema fixed in advance. To provide the benefits of a schema in such environments, we introduce DataGuides: concise and accurate structural summaries of semistructured databases. DataGuides serve as dynamic schemas, generated from the database; they are useful for browsing database structure, formulating queries, storing information such as statistics and sample values, and enabling query optimization. This paper presents the theoretical foundations of DataGuides along with an algorithm for their creation and an overview of incremental maintenance. We provide performance results based on our implementation of DataGuides in the Lore DBMS for semistructured data. We also describe the use of DataGuides in Lore, both in the user interface to enable structure browsing and query formulation, and as a means of guiding the query processor and optimizing query execution.
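To make the idea of a structural summary concrete, the following is a minimal sketch, not the paper's own algorithm or the Lore implementation. It assumes the database is a rooted graph of labeled edges and builds a summary in which every label path of the source appears exactly once, using a construction analogous to converting a nondeterministic finite automaton into a deterministic one. The names `build_dataguide` and `DB`, and the sample restaurant data, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_dataguide(edges, root):
    """Summarize a labeled graph so each label path occurs exactly once.

    edges: dict mapping an object id to a list of (label, child id) pairs.
    Returns (start, guide), where each guide node is the frozenset of source
    objects reachable by one label path (an assumption of this sketch).
    """
    start = frozenset([root])
    guide = {}          # guide node -> {label: guide node}
    pending = [start]
    while pending:
        targets = pending.pop()
        if targets in guide:
            continue    # already expanded; this also terminates on cycles
        by_label = defaultdict(set)
        for obj in targets:
            for label, child in edges.get(obj, []):
                by_label[label].add(child)
        guide[targets] = {lbl: frozenset(kids) for lbl, kids in by_label.items()}
        pending.extend(guide[targets].values())
    return start, guide


# Hypothetical source database: two restaurants with differing structure.
DB = {
    1: [("Restaurant", 2), ("Restaurant", 3)],
    2: [("Name", 4), ("Entree", 5)],
    3: [("Name", 6), ("Phone", 7)],
}

start, guide = build_dataguide(DB, 1)
for node, children in guide.items():
    for label, child in children.items():
        print(sorted(node), label, sorted(child))
```

In this sketch the two `Restaurant` edges collapse into a single edge of the summary, and the node they lead to lists `Name`, `Entree`, and `Phone` once each, which is the kind of information a user would browse when formulating a query.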