
Showing papers on "Data warehouse" published in 1994


Book
27 Jul 1994
TL;DR: Defines the data warehouse and the operational data store (ODS), presents them together as an architecture for information systems, and examines their use from the manager's and the end user's perspectives.
Abstract: The Data Warehouse: A Definition. The Operational Data Store. The Data Warehouse and the ODS: Architecture for Information Systems. Using the Data Warehouse: The Manager's Perspective. Using the Data Warehouse: The End User's Perspective. Using the Data Warehouse: Creating the Analysis. Using Information for Competitive Advantage: Some Examples. Administering the Data Warehouse Environment. Migration to the Architected Environment. Connecting to the Data Warehouse. Index.

196 citations


Patent
04 Oct 1994
TL;DR: In this patent, a schema of the data in a database warehouse is presented to the user as virtual tables whose arrangement differs from the arrangement of data in the underlying fact tables and reference tables.
Abstract: A database warehouse includes a database having data arranged in data tables, e.g., fact tables and reference tables. A warehouse database hub interface is connected to the database. The warehouse database hub interface presents to a user a schema of the data in the database warehouse. The schema consists of virtual tables. The arrangement of the data in the virtual tables is different from the arrangement of data in the fact tables and the reference tables. A user generates queries based on the schema provided by the warehouse database hub interface. In response to such a query for particular information stored in the database warehouse, the warehouse database hub interface modifies the query to take into account pre-computed values and the arrangement of the data within the database warehouse. Then the warehouse database hub interface queries the database warehouse using the modified query to obtain the particular information from the database warehouse. Finally, the warehouse database hub interface forwards the particular information obtained from the database warehouse to the user.
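A minimal sketch of the kind of rewrite step such a hub interface might perform, in Python: a virtual table is expanded into its fact/reference join unless a pre-computed summary table already covers the requested aggregate. The table names, SQL strings, and the VIRTUAL_TABLES/PRECOMPUTED structures are illustrative assumptions, not the patent's actual design.

    # Illustrative sketch only: rewrite a query against a virtual table so that it
    # runs against a pre-computed summary table when one covers the request.
    VIRTUAL_TABLES = {
        # virtual table -> underlying join of fact and reference tables
        "sales_by_region": (
            "SELECT r.region, f.amount FROM sales_fact f "
            "JOIN region_ref r ON f.region_id = r.region_id"
        ),
    }

    PRECOMPUTED = {
        # (virtual table, aggregate) -> materialized summary table
        ("sales_by_region", "SUM(amount)"): "sales_region_totals",
    }

    def rewrite(virtual_table: str, aggregate: str) -> str:
        """Return SQL for the user's request, preferring a pre-computed table."""
        summary = PRECOMPUTED.get((virtual_table, aggregate))
        if summary is not None:
            # The hub can answer from the summary table directly.
            return f"SELECT region, total FROM {summary}"
        # Otherwise expand the virtual table into its fact/reference join.
        return (f"SELECT region, {aggregate} AS total "
                f"FROM ({VIRTUAL_TABLES[virtual_table]}) v GROUP BY region")

    if __name__ == "__main__":
        print(rewrite("sales_by_region", "SUM(amount)"))   # answered from the summary
        print(rewrite("sales_by_region", "AVG(amount)"))   # expanded to the base join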

141 citations


Book
01 Jan 1994
TL;DR: This book provides a thorough grounding in theory before guiding the reader through the various stages of applied data modeling and database design, and has been given significantly expanded coverage and reorganized for greater reader comprehension.
Abstract: Data Modeling Essentials, Third Edition provides expert tutelage for data modelers, business analysts and systems designers at all levels. Beginning with the basics, this book provides a thorough grounding in theory before guiding the reader through the various stages of applied data modeling and database design. Later chapters address advanced subjects, including business rules, data warehousing, enterprise-wide modeling and data management. The third edition of this popular book retains its distinctive hallmarks of readability and usefulness, but has been given significantly expanded coverage and reorganized for greater reader comprehension. Authored by two leaders in the field, Data Modeling Essentials, Third Edition is the ideal reference for professionals and students looking for a real-world perspective. Thorough coverage of the fundamentals and relevant theory. Recognition and support for the creative side of the process. Expanded coverage of applied data modeling includes new chapters on logical and physical database design. New material describing a powerful technique for model verification. Unique coverage of the practical and human aspects of modeling, such as working with business specialists, managing change, and resolving conflict. Extensive online component including course notes and other teaching aids (www.mkp.com).

93 citations



Book
01 Jul 1994
TL;DR: In this book, the author takes a step beyond the technological and architectural aspects of information technology (IT) to reveal the data warehouse as a driver of corporate strategy formulation: the essential technological vehicle for slicing and dicing the market into the micro segments that will enable mass-production industrial organisations to transform themselves into mass-customisation, customer-focused businesses.
Abstract: From the Publisher: In this book Sean Kelly takes a step beyond the technological architectural aspects of information technology (IT) to reveal it as a driver of corporate strategy formulation. The data warehouse is illustrated as the essential technological vehicle required to slice and dice the market into the micro segments that will enable mass production industrial organisations to transform themselves into mass-customisation customer-focused businesses. Sean Kelly provides a coherent conceptual framework of the cognitive process. He charts a course for these organisations embarking on the corporate data warehouse journey, allowing them to see the totality of data in the enterprise constructed into integrated patterns and pictures which can add dramatic value to decision making. Corporate directors, information systems professionals and business users of information systems alike cannot afford to ignore the message in this book.

37 citations


Book
14 Mar 1994
TL;DR: This book lays out a common data architecture for building a shared, mature data resource, covering data names, definitions, structure, quality, and documentation, and the cross-referencing and translation needed to refine disparate data.
Abstract: Shared Data Vision. Common Data Architecture. Data Names. Data Definition. Data Structure. Data Quality. Data Documentation. Data Refining. Disparate Data Description. Disparate Data Structure. Disparate Data Quality. Data Cross Reference. Data Translation. Mature Data Resource. Data Completeness. Shared Data Reality. Postscript. Appendices. Glossary. Bibliography. Index.
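The data cross reference and data translation chapters suggest a simple mechanism: mapping the disparate names used by each source onto the names of the common data architecture. A minimal Python sketch, with hypothetical source and field names:

    # Illustrative sketch: a data name cross reference that translates records from
    # disparate sources into the names used by a common data architecture.
    # Sources and field names below are hypothetical examples.
    CROSS_REFERENCE = {
        "payroll":  {"EMP_NO": "employee_id", "EMP_NM": "employee_name"},
        "benefits": {"EmpId": "employee_id", "FullName": "employee_name"},
    }

    def translate(source: str, record: dict) -> dict:
        """Rename a source record's fields to the common data names."""
        mapping = CROSS_REFERENCE[source]
        return {mapping.get(k, k): v for k, v in record.items()}

    if __name__ == "__main__":
        print(translate("payroll", {"EMP_NO": 42, "EMP_NM": "Ada Lovelace"}))
        print(translate("benefits", {"EmpId": 42, "FullName": "Ada Lovelace"}))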

22 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose an extension of the MVC (model, view, controller) web-based framework that allows for the rapid development and implementation of data integration systems solutions suitable for use on the web.
Abstract: The integration of multiple autonomous and heterogeneous data sources (both across the web and via a company intranet) has received much attention throughout the years, particularly due to its many applications in the fields of Artificial Intelligence and medical research data sharing. Data integration systems embody this work and have come very far in the past twenty years. The problem of designing such systems is characterized by a number of issues that are interesting from a theoretical point of view: answering queries using logical views, query containment and completeness, automatic integration of existing data sources via schema mapping tools, etc. In this work we discuss these issues, compare and contrast various proposed solutions (federated database systems and data warehouses), and finally propose a novel extension of the MVC (model, view, controller) web-based framework that allows for the rapid development and implementation of data integration systems solutions suitable for use on the web.
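One of the issues listed, answering queries using logical views, can be sketched in a few lines of Python: the integrated (global) view is defined as a query over autonomous sources, and user queries are answered against that view rather than against any one source. The source functions and fields here are hypothetical, not part of the paper's system.

    # Illustrative sketch: a logical (global) view answered by combining two
    # autonomous sources; a global-as-view style mapping under assumed names.
    def intranet_patients():
        return [{"id": 1, "name": "A. Smith", "site": "intranet"}]

    def web_registry_patients():
        return [{"id": 2, "name": "B. Jones", "site": "web"}]

    def patients_view():
        # The integrated view is defined as a query (here, a union) over sources.
        for source in (intranet_patients, web_registry_patients):
            yield from source()

    def answer_query(predicate):
        """Answer a user query against the logical view, not the sources."""
        return [row for row in patients_view() if predicate(row)]

    if __name__ == "__main__":
        print(answer_query(lambda r: r["site"] == "web"))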

15 citations


Proceedings ArticleDOI
24 May 1994
TL;DR: Some of the dimensions of the replication solution space, including latency, concurrency, logical and physical units of replication, network link requirements, heterogeneity, replica topology, replica transparency, and data transformation requirements, are identified.
Abstract: Data replication has recently become a topic of increased interest among customers. Several database vendors provide products that perform data replication. The capabilities of these products and the customer problems they solve vary widely. This talk starts by identifying some of the dimensions of the replication solution space, including latency, concurrency, logical and physical units of replication, network link requirements, heterogeneity, replica topology, replica transparency, and data transformation requirements. Digital Equipment Corporation provides three products that allow customers to replicate data. The distributed, two-phase commit products allow customers to program and coordinate replicated updates. DEC™ Reliable Transaction Router provides an OLTP environment with transactional data replication. Transactions succeed in the face of site and network failures. DEC™ Data Distributor manages the automated distribution of data using scheduled or ad-hoc batch data transfers. Data replication can be used to distribute data to remote sites, combine data from multiple sites, and archive data. In the process of moving the data from one site to another, data transformations can be applied. The balance of the talk describes the Data Distributor, explaining the replica topologies it supports, their application in solving customer problems, and the internal mechanisms used to support them.
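The idea of applying a data transformation while moving data between sites can be illustrated with a minimal Python sketch. This assumes a simple hub-to-replicas batch copy; the function names, fields, and topology are hypothetical and not Digital's product interfaces.

    # Illustrative sketch: batch replication from a source site to target replicas,
    # applying a transformation to each row in transit (all names hypothetical).
    import copy

    def to_usd(row):
        # Example transformation applied while the data move between sites.
        row = dict(row)
        row["amount_usd"] = round(row.pop("amount_local") * row.pop("fx_rate"), 2)
        return row

    def replicate(source_rows, replicas, transform=lambda r: r):
        """Copy every source row to every replica, transforming it on the way."""
        for row in source_rows:
            for replica in replicas:
                replica.append(transform(copy.deepcopy(row)))

    if __name__ == "__main__":
        source = [{"order": 1, "amount_local": 100.0, "fx_rate": 0.92}]
        site_a, site_b = [], []
        replicate(source, [site_a, site_b], to_usd)
        print(site_a, site_b)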

12 citations


Proceedings ArticleDOI
01 Jan 1994
TL;DR: An object-oriented database management system can be used to implement an object- oriented heterogeneous database (OOHDB) that efficiently addresses the problem of heterogeneity among large scientific property databases.
Abstract: There are many problems facing scientists in their efforts to use computers to manage, access, and analyze data. One serious problem is the heterogeneity of data sources. Previous research into accessing heterogeneous data has not addressed problems with accessing large data sources that are not managed by database management systems. We believe that an object-oriented database management system can be used to implement an object-oriented heterogeneous database (OOHDB) that efficiently addresses the problem of heterogeneity among large scientific property databases. The object-oriented data model provides a powerful modeling paradigm for representing complex scientific data. Objects within the OOHDB hide heterogeneity among various data sources. Efficient access to external data is achieved by representing external data with a small amount of data stored within the OOHDB.
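The last two sentences describe what amounts to a proxy-object pattern: only a small descriptor lives inside the database, and the bulk data are fetched from the external source behind a uniform interface. A minimal Python sketch under assumed names (not the paper's actual classes):

    # Illustrative sketch: proxy objects hide where and how the underlying data are
    # stored; only a small descriptor is kept, and values are read on demand.
    class PropertyDataset:
        """Common object interface over heterogeneous external property data."""
        def __init__(self, name, locator, reader):
            self.name = name          # small descriptor stored in the OOHDB
            self.locator = locator    # e.g. a file path or remote identifier
            self._reader = reader     # source-specific access routine

        def values(self):
            # Heterogeneity is hidden behind one method; callers never see formats.
            return self._reader(self.locator)

    def read_csv_like(locator):
        return [float(x) for x in "1.0,2.5,3.1".split(",")]   # stand-in for a file

    def read_binary_like(locator):
        return [10.0, 20.0]                                    # stand-in for a blob

    if __name__ == "__main__":
        datasets = [
            PropertyDataset("melting points", "props.csv", read_csv_like),
            PropertyDataset("band gaps", "gaps.bin", read_binary_like),
        ]
        for d in datasets:
            print(d.name, d.values())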

9 citations


Proceedings ArticleDOI
28 Sep 1994
TL;DR: It is argued that accessing and manipulating such data requires the construction of an abstract data model that intuitively models how a domain specialist views the data.
Abstract: Scientific data are often characterized by large volumes of data, complex data relationships, and complex algorithms that operate on those data. Databases at the Space Telescope Science Institute are used to illustrate some of the differences between traditional data management schemes and the requirements of scientific data. Data from radio astronomy are modeled using conceptual enhancements to flexible image transport system (FITS) binary tables, which are compared to relational database tables. These enhancements are used to capture features of a more general notion of an observation. Finally, it is argued that accessing and manipulating such data, in order to support the entire process of data analysis, requires the construction of an abstract data model that intuitively models how a domain specialist views the data.
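The argument for an abstract data model can be pictured with a short Python sketch: an "observation" object presents the data the way a domain specialist thinks about them, layered over a flat, binary-table-like row. The column names and fields here are hypothetical, not the paper's FITS enhancements.

    # Illustrative sketch: an abstract observation model built from one row of a
    # flat binary-table dump (a stand-in for a FITS binary table).
    from dataclasses import dataclass

    @dataclass
    class Observation:
        target: str
        frequency_ghz: float
        fluxes: list          # the measured values

        def peak_flux(self):
            return max(self.fluxes)

    def from_table(row, flux_column):
        """Build the abstract object from one flat table row."""
        return Observation(target=row["TARGET"],
                           frequency_ghz=row["FREQ_GHZ"],
                           fluxes=row[flux_column])

    if __name__ == "__main__":
        table_row = {"TARGET": "3C 273", "FREQ_GHZ": 1.4, "FLUX": [2.1, 2.4, 2.2]}
        obs = from_table(table_row, "FLUX")
        print(obs.target, obs.peak_flux())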

7 citations


01 Jan 1994
TL;DR: This dissertation addresses a user's need to formulate queries to multiple heterogeneous databases easily, and to have confidence in the results that are returned, and presents an intelligent heterogeneous database architecture that provides a framework for doing so.
Abstract: Existing (or legacy) databases are typified by differences in data representation, data access languages, and differing data models. Data representation differences include name, format and structural differences for identical and similar data stored in more than one legacy database. Data access language differences may require multiple queries to complete the retrieval of all values of a data element stored in more than one legacy database. And differences in data model constructs may result in similarly named data elements being represented at different levels of abstraction which exhibit different properties. These differences make access difficult for most users. To resolve such problems, this dissertation addresses a user's need to formulate queries to multiple heterogeneous databases easily, and to have confidence in the results that are returned. The Intelligent Heterogeneous Autonomous Database Architecture (InHead) approach involves the use of Artificial Intelligence tools and techniques to construct "domain models," that is, data and knowledge representations of the constituent databases and an overall domain model of the semantic interactions among the databases. These domain models are represented as Knowledge Sources (KSs) in a blackboard architecture. The work described in this dissertation provides four major contributions. The first is the specification of an active and intelligent global thesaurus. The second contribution is the extension of the traditional notion of an export schema into that of an "Export Data/Knowledge/Task" schema. The third contribution is the specification and use of "Data/Knowledge Packets," which are a means of encapsulating object structure, relationships, operations, constraints, and rules into a meaningful unit, or packet. The fourth contribution is the specification of an intelligent heterogeneous database architecture that provides a framework for the above.
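The blackboard idea, one knowledge source per constituent database contributing partial answers to a shared working area, can be sketched briefly in Python. The class, field names, and data below are hypothetical illustrations, not the InHead system itself.

    # Illustrative sketch of a blackboard: each knowledge source (KS) wraps one
    # constituent database and posts its partial results to shared working memory.
    class KnowledgeSource:
        def __init__(self, name, rows):
            self.name, self.rows = name, rows

        def contribute(self, blackboard, wanted_field):
            for row in self.rows:
                if wanted_field in row:
                    blackboard.append({"source": self.name,
                                       wanted_field: row[wanted_field]})

    if __name__ == "__main__":
        knowledge_sources = [
            KnowledgeSource("legacy_hr", [{"salary": 50000, "dept": "IT"}]),
            KnowledgeSource("legacy_payroll", [{"salary": 52000}]),
        ]
        blackboard = []                      # shared working memory
        for ks in knowledge_sources:
            ks.contribute(blackboard, "salary")
        print(blackboard)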


Proceedings Article
12 Sep 1994
TL;DR: This paper reports on the managerial experience, technical approach, and lessons learned from reengineering eight departmental large-scale information systems to migrate these systems into a set of enterprise-wide systems, which incorporate current and future requirements, drastically reduce operational and maintenance cost, and facilitate common understandings among stakeholders.
Abstract: This paper reports on the managerial experience, technical approach, and lessons learned from reengineering eight departmental large-scale information systems. The driving strategic objective of each project was to migrate these systems into a set of enterprise-wide systems, which incorporate current and future requirements, drastically reduce operational and maintenance cost, and facilitate common understandings among stakeholders (i.e., policy makers, high-level management, and IS developers/maintainers/end-users). A logical data model, which contains requirements, rules, and physical data representation as well as logical data objects, clearly documents the baseline data requirements implemented by the legacy system and is crucial to achieving this strategic goal. Re-engineering products are captured in the dictionaries of a CASE tool (i.e., in the form of a business process decomposition hierarchy, as-is data model, normalized logical data model, and linkages among data objects) and are supplemented with traceability matrices in spreadsheets. The re-engineered data products are used as follows: (1) migration of the legacy databases to relational database management systems, (2) automatic generation of databases and applications for migration from mainframes to client-server, (3) enterprise data standardization, (4) integration of disparate information systems, (5) re-documentation, (6) data quality assessment and assurance, and (7) baseline specifications for future systems.
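Use (1), migrating legacy data into a relational database guided by a normalized logical model, can be illustrated with a small Python sketch. The table and field names are hypothetical, chosen only to show a flat legacy record being split into normalized rows.

    # Illustrative sketch: splitting a flat legacy record into normalized relational
    # rows driven by a simple logical data model (names are hypothetical).
    import sqlite3

    legacy_records = [
        {"emp_no": "007", "emp_name": "J. Bond", "dept_code": "D1", "dept_name": "Field Ops"},
    ]

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE department (code TEXT PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE employee (emp_no TEXT PRIMARY KEY, name TEXT, dept_code TEXT)")

    for rec in legacy_records:
        # Department facts move to their own table (normalizing repeating data).
        conn.execute("INSERT OR IGNORE INTO department VALUES (?, ?)",
                     (rec["dept_code"], rec["dept_name"]))
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)",
                     (rec["emp_no"], rec["emp_name"], rec["dept_code"]))

    print(conn.execute("SELECT e.name, d.name FROM employee e "
                       "JOIN department d ON e.dept_code = d.code").fetchall())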

Journal ArticleDOI
J. P. Singleton, M. M. Schwartz
TL;DR: Concepts of independence between software components and how this independence can provide flexibility for change are discussed as a key emphasis of this paper.
Abstract: IBM's Information Warehouse™ framework provides a basis for satisfying enterprise requirements for effective use of business data resources. It includes an architecture that defines the structure and interfaces for integrated solutions and includes products and services that can be used to create solutions. This paper uses the Information Warehouse architecture as a context to describe software components that can be used for direct access to formatted business data in a heterogeneous systems environment. Concepts of independence between software components and how this independence can provide flexibility for change are discussed. The integration of software from multiple vendors to create effective solutions is a key emphasis of this paper.
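The independence-between-components idea can be shown with a minimal Python sketch: the consumer is written against a small access interface, so a back-end component can be replaced without changing it. These classes are hypothetical stand-ins, not IBM's Information Warehouse interfaces.

    # Illustrative sketch of component independence: the report depends only on a
    # fetch() interface, so data-access back ends can be swapped for change.
    class Db2Access:
        def fetch(self, query):
            return [("from DB2", query)]

    class FlatFileAccess:
        def fetch(self, query):
            return [("from flat file", query)]

    def report(access_component, query):
        """The consumer is written against fetch(), not against any one back end."""
        for row in access_component.fetch(query):
            print(row)

    if __name__ == "__main__":
        report(Db2Access(), "SELECT * FROM ORDERS")
        report(FlatFileAccess(), "orders.dat")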


Proceedings ArticleDOI
28 Sep 1994
TL;DR: The next step needed for integration of data is to provide a common data dictionary with a conceptual schema across the data to mask the many differences that occur when databases are developed independently.
Abstract: Connectivity products are available to connect computers containing data. DBMS vendors are providing gateways into their products, and SQL is being retrofitted on many older DBMSs to make it easier to access data from standard 4GL products and application development systems. The next step needed for integration of data is to provide a common data dictionary with a conceptual schema across the data to mask the many differences that occur when databases are developed independently. Data Integration Inc. has two products for addressing the problem: InterViso IVBuild for developing the common data dictionary, and InterViso IVQuery for access to diverse databases. IVQuery is a DBMS front-end that allows a user to access data that is managed by existing DBMSs. IVQuery needs the definitions of the existing databases (the "local schemas") and a view across the databases (the "federated schema"), which are stored in a DD/D (data dictionary/directory). The DD/D Builder program, called IVBuild, is used to develop and maintain the DD/D. IVBuild differs from other data modeling tools in that the description of the individual databases is input rather than output.
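The relationship between local schemas, a federated schema, and the DD/D can be sketched as a small lookup structure in Python. The dictionary layout, database names, and fields are hypothetical, not InterViso's actual format.

    # Illustrative sketch of a DD/D: a federated-schema element defined over two
    # local schemas, plus a lookup telling a front end where to fetch a field.
    DATA_DICTIONARY = {
        "local_schemas": {
            "hospital_db": {"patient": ["pat_id", "pat_name"]},
            "billing_db":  {"account": ["acct_no", "patient_id", "balance"]},
        },
        "federated_schema": {
            # federated field -> (database, table, column)
            "Patient.id":      ("hospital_db", "patient", "pat_id"),
            "Patient.name":    ("hospital_db", "patient", "pat_name"),
            "Patient.balance": ("billing_db", "account", "balance"),
        },
    }

    def resolve(federated_field):
        """Map a federated-schema field to its local database, table, and column."""
        return DATA_DICTIONARY["federated_schema"][federated_field]

    if __name__ == "__main__":
        print(resolve("Patient.balance"))   # -> ('billing_db', 'account', 'balance')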

Journal Article
TL;DR: Answers physicians' questions about data warehouses, the clinical and financial data about their patients stored in them, who should be responsible for such a warehouse, and what sort of training those in charge of using it need.
Abstract: Soon, most physicians will begin to learn about data warehouses and clinical and financial data about their patients stored in them. What is a data warehouse? Why are we seeing their emergence in health care only now? How does a hospital, or group practice, or health plan acquire or create a data warehouse? Who should be responsible for it, and what sort of training is needed by those in charge of using it for the edification of the sponsoring organization? I'll try to answer these questions in this article.