Book

Building the data warehouse

01 Jan 1992
TL;DR: This Second Edition of Building the Data Warehouse is revised and expanded to include new techniques and applications of data warehouse technology and update existing topics to reflect the latest thinking.
Abstract: From the Publisher: The data warehouse solves the problem of getting information out of legacy systems quickly and efficiently. If designed and built right, data warehouses can provide significant freedom of access to data, thereby delivering enormous benefits to any organization. In this unique handbook, W. H. Inmon, "the father of the data warehouse," provides detailed discussion and analysis of all major issues related to the design and construction of the data warehouse, including granularity of data, partitioning data, metadata, lack of credibility of decision support systems (DSS) data, the system of record, migration and more. This Second Edition of Building the Data Warehouse is revised and expanded to include new techniques and applications of data warehouse technology and update existing topics to reflect the latest thinking. It includes a useful review checklist to help evaluate the effectiveness of the design.
Citations
Journal Article
01 Mar 1997
TL;DR: An overview of data warehousing and OLAP technologies, with an emphasis on their new requirements, is provided, based on a tutorial presented at the VLDB Conference, 1996.
Abstract: Data warehousing and on-line analytical processing (OLAP) are essential elements of decision support, which has increasingly become a focus of the database industry. Many commercial products and services are now available, and all of the principal database management system vendors now have offerings in these areas. Decision support places some rather different requirements on database technology compared to traditional on-line transaction processing applications. This paper provides an overview of data warehousing and OLAP technologies, with an emphasis on their new requirements. We describe back end tools for extracting, cleaning and loading data into a data warehouse; multidimensional data models typical of OLAP; front end client tools for querying and data analysis; server extensions for efficient query processing; and tools for metadata management and for managing the warehouse. In addition to surveying the state of the art, this paper also identifies some promising research issues, some of which are related to problems that the database research community has worked on for years, but others are only just beginning to be addressed. This overview is based on a tutorial that the authors presented at the VLDB Conference, 1996.

2,835 citations

01 Jan 2006
TL;DR: There have been many data mining books published in recent years, including Predictive Data Mining by Weiss and Indurkhya [WI98], Data Mining Solutions: Methods and Tools for Solving Real-World Problems by Westphal and Blaxton [WB98], and Mastering Data Mining: The Art and Science of Customer Relationship Management by Berry and Linoff [BL99].
Abstract: The book Knowledge Discovery in Databases, edited by Piatetsky-Shapiro and Frawley [PSF91], is an early collection of research papers on knowledge discovery from data. The book Advances in Knowledge Discovery and Data Mining, edited by Fayyad, Piatetsky-Shapiro, Smyth, and Uthurusamy [FPSSe96], is a collection of later research results on knowledge discovery and data mining. There have been many data mining books published in recent years, including Predictive Data Mining by Weiss and Indurkhya [WI98], Data Mining Solutions: Methods and Tools for Solving Real-World Problems by Westphal and Blaxton [WB98], Mastering Data Mining: The Art and Science of Customer Relationship Management by Berry and Linoff [BL99], Building Data Mining Applications for CRM by Berson, Smith, and Thearling [BST99], Data Mining: Practical Machine Learning Tools and Techniques by Witten and Frank [WF05], Principles of Data Mining (Adaptive Computation and Machine Learning) by Hand, Mannila, and Smyth [HMS01], The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman [HTF01], Data Mining: Introductory and Advanced Topics by Dunham, and Data Mining: Multimedia, Soft Computing, and Bioinformatics by Mitra and Acharya [MA03]. There are also books containing collections of papers on particular aspects of knowledge discovery, such as Machine Learning and Data Mining: Methods and Applications edited by Michalski, Bratko, and Kubat [MBK98], and Relational Data Mining edited by Dzeroski and Lavrac [De01], as well as many tutorial notes on data mining in major database, data mining and machine learning conferences.

2,591 citations


Additional excerpts

  • ...[KRRT98], Mastering Data Warehouse Design: Relational and Dimensional Techniques by Imhoff, Galemmo, and Geiger [IGG03], Building the Data Warehouse by Inmon [Inm96], and OLAP Solutions: Building Multidimensional Information Systems by Thomsen [Tho97]....

    [...]

  • ...[Inm96] W....

    [...]

Journal Article
TL;DR: This essay aims to help researchers appreciate the levels of artifact abstractions that may be DSR contributions, identify appropriate ways of consuming and producing knowledge when they are preparing journal articles or other scholarly works, and understand and position the knowledge contributions of their research projects.
Abstract: Design science research (DSR) has staked its rightful ground as an important and legitimate Information Systems (IS) research paradigm. We contend that DSR has yet to attain its full potential impact on the development and use of information systems due to gaps in the understanding and application of DSR concepts and methods. This essay aims to help researchers (1) appreciate the levels of artifact abstractions that may be DSR contributions, (2) identify appropriate ways of consuming and producing knowledge when they are preparing journal articles or other scholarly works, (3) understand and position the knowledge contributions of their research projects, and (4) structure a DSR article so that it emphasizes significant contributions to the knowledge base. Our focal contribution is the DSR knowledge contribution framework with two dimensions based on the existing state of knowledge in both the problem and solution domains for the research opportunity under study. In addition, we propose a DSR communication schema with similarities to more conventional publication patterns, but which substitutes the description of the DSR artifact in place of a traditional results section. We evaluate the DSR contribution framework and the DSR communication schema via examinations of DSR exemplar publications.

2,221 citations

Journal Article
01 Jun 2002
TL;DR: The evolution of DSS technologies and issues related to DSS definition, application, and impact are discussed, and four powerful decision support tools, including data warehouses, OLAP, data mining, and Web-based DSS are presented.
Abstract: Since the early 1970s, decision support systems (DSS) technology and applications have evolved significantly. Many technological and organizational developments have exerted an impact on this evolution. DSS once utilized more limited database, modeling, and user interface functionality, but technological innovations have enabled far more powerful DSS functionality. DSS once supported individual decision-makers, but later DSS technologies were applied to workgroups or teams, especially virtual teams. The advent of the Web has enabled inter-organizational decision support systems, and has given rise to numerous new applications of existing technology as well as many new decision support technologies themselves. It seems likely that mobile tools, mobile e-services, and wireless Internet protocols will mark the next major set of developments in DSS. This paper discusses the evolution of DSS technologies and issues related to DSS definition, application, and impact. It then presents four powerful decision support tools, including data warehouses, OLAP, data mining, and Web-based DSS. Issues in the field of collaborative support systems and virtual teams are presented. This paper also describes the state of the art of optimization-based decision support and active decision support for the next millennium. Finally, some implications for the future of the field are discussed.

1,360 citations

Journal Article
14 Nov 2009
TL;DR: STRIDE's semantic model uses standardized terminologies, such as SNOMED, RxNorm, ICD and CPT, to represent important biomedical concepts and their relationships to create a standards-based informatics platform supporting clinical and translational research.
Abstract: STRIDE (Stanford Translational Research Integrated Database Environment) is a research and development project at Stanford University to create a standards-based informatics platform supporting clinical and translational research. STRIDE consists of three integrated components: a clinical data warehouse, based on the HL7 Reference Information Model (RIM), containing clinical information on over 1.3 million pediatric and adult patients cared for at Stanford University Medical Center since 1995; an application development framework for building research data management applications on the STRIDE platform; and a biospecimen data management system. STRIDE’s semantic model uses standardized terminologies, such as SNOMED, RxNorm, ICD and CPT, to represent important biomedical concepts and their relationships. The system is in daily use at Stanford and is an important component of Stanford University’s CTSA (Clinical and Translational Science Award) Informatics Program.

972 citations

References
Book
01 Mar 1993

49 citations

Book
01 Jan 1992
TL;DR: This work focuses on the development of standards in the Information Engineering Environment and the design of physical database models for this environment.
Abstract: Basic Building Blocks. The Target Architecture. The High-Level Data Model--Entity-Relationship Diagram. The Mid-Level Data Model--Data-Item Set. The Low-Level Data Model--The Physical Model. Generic Data Models. The Process Model. Physical Database Design. Processes and Performance. Recursive Data/Recursive Processing. Data Design and Parallel Processing. Client/Server Processing. The Target Architecture--Some Other Perspectives. Developing Quality Systems. Supporting Software. Standards in the Information Engineering Environment. Organizational Impacts of Information Engineering. Glossary. References. Index.

10 citations