Showing papers on "Data mart published in 2002"
•
01 Jan 2002
TL;DR: This book discusses the evolution of Financial Business Intelligence, the power of Business Intelligence tools, and the challenges faced in implementing a Business Intelligence System.
Abstract: Foreword. Introduction. Part One: Evolution of Financial Business Intelligence. 1. History and Future of Business Intelligence. History of BI. Trends. 2. Leveraging the Power of Business Intelligence Tools. New Breed of Business Intelligence Tools. 3. Why Consider a Financial BI Tool in Your Organization? Defining the Financial Datawarehouse. What Is Your Company's Business Intelligence Readiness? Part Two: BI Technology. 4. Platforms and Differences in Technology. Financial and Nonfinancially Focused Tools. Who Are the Players? Unix versus Microsoft as a Platform Choice. 5. Performance: Getting Information on Time. Fast Enough for Your Business. Periodicity Considerations. Data Storage Methodology (ROLAP, MOLAP, or HOLAP?). 6. Building the Datawarehouse/Mart. Important Success Factors. Using Financial Data Sources. Staging Data for Financial Analysis. 7. Front-end Analytic Tools. Specialized BI Tools. Real-Time Analysis: Myth or Reality (Technology Perspective). Excel Add-Ins. Traditional Report Writer versus OLAP-Based Analysis Tools. 8. Security. Role-Based Access. Internet Security: Only as Good as Your Barriers. 9. The Internet's Impact on Business Intelligence. Everyone's a Player. What Is a Portal? How Do I Deliver BI Information for the Internet? Part Three: Software Evaluation and Selection. 10. Selecting a Business Intelligence Solution. Create a Plan. Using a Software Selection Company. 11. Software Evaluation: Factors to Consider. Expected Use Now and in the Future. Getting the Rest of the Company to Buy in to the Process. Cost/Benefit Analysis. Return on Investment Analysis. Features and Flexibility. Compatibility with Existing Software. Ease of Use. Software Stability. Vendor-Related Items. Working with an Implementation Partner. How to Select: Summary. 12. Outsourcing: The New Alternative. How It Works. When You Should Consider an Application Service Provider. Selection Criteria for an Application Service Provider. 
Ensuring Continuous Success. 13. Buyer's Guide. Query and Reporting Systems. Decision Support Systems. OLAP. Enterprise Information Portals. Datawarehouse Software. Extraction, Transformation, and Loading Tools Vendors. eLearning Tools Vendors. Part Four: Implementing a Business Intelligence System. 14. Project Planning. Business Intelligence Project Dos. Business Intelligence Project Don'ts. Drafting and Executing the Project Plan. Sample Project Plan. 15. Datawarehouse or Data Mart? 16. Multidimensional Model Definition. Defining Dimension Hierarchies and Dimension Types. Multidimensional Schemas: Star and Snowflake. 17. Model Maintenance. Periodicity Considerations. Slowly Changing Dimensions. More Help in Maintaining Dimensions. 18. Financial Data Modeling. Data Collection. Extending Your Financial Vision! Balance Sheet. 19. Survey of Datawarehouse Users. The Survey. Analysis of Responses. Appendix A: Sample RFP. Appendix B: Software Candidate Evaluation and Rating Sheet. Appendix C: Sample License Agreement. Appendix D: Sample Confidentiality and Nondisclosure Agreement. (Sales/Demo Process). Appendix E: Sample Support Plan/Agreement. Appendix F: Sample Project Plan. Appendix G: Sample Consulting Agreement. Appendix H: Vendor Addresses. Appendix I: References and Further Reading. Glossary. Index.
48 citations
••
TL;DR: Data warehousing concepts are brought to life through a case study of Harrah’s Entertainment, a firm that became a leader in the gaming industry with its CRM business strategy supported by data warehousing.
Abstract: Data warehousing is a strategic business and IT initiative in many organizations today. Data warehouses can be developed in two alternative ways -- the data mart and the enterprisewide data warehouse strategies -- and each has advantages and disadvantages. To create a data warehouse, data must be extracted from source systems, transformed, and loaded to an appropriate data store. Depending on the business requirements, either relational or multidimensional database technology can be used for the data stores. To provide a multidimensional view of the data using a relational database, a star schema data model is used. Online analytical processing can be performed on both kinds of database technology. Metadata about the data in the warehouse is important for IT and end users. A variety of data access tools and applications can be used with a data warehouse – SQL queries, management reporting systems, managed query environments, DSS/EIS, enterprise intelligence portals, data mining, and customer relationship management. A data warehouse can be used to support a variety of users – executives, managers, analysts, operational personnel, customers, and suppliers. Data warehousing concepts are brought to life through a case study of Harrah’s Entertainment, a firm that became a leader in the gaming industry with its CRM business strategy supported by data warehousing.
45 citations
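The star-schema idea described above can be sketched with a toy example. The tables, columns, and figures below are invented for illustration (they are not from the article or the Harrah's case study); the point is how a relational engine provides a multidimensional view by grouping fact-table measures on dimension attributes:

```python
import sqlite3

# Hypothetical star schema: one fact table joined to two dimension tables.
# All names and values are invented for illustration.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE dim_store   (store_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales  (store_id INTEGER, product_id INTEGER, amount REAL);
INSERT INTO dim_store   VALUES (1, 'East'), (2, 'West');
INSERT INTO dim_product VALUES (10, 'Slots'), (11, 'Dining');
INSERT INTO fact_sales  VALUES (1, 10, 100.0), (1, 11, 40.0), (2, 10, 60.0);
""")

# An OLAP-style query: aggregate fact-table measures, grouping by dimension
# attributes. This is the "multidimensional view" realized relationally.
rows = cur.execute("""
    SELECT s.region, p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_store s   ON f.store_id   = s.store_id
    JOIN dim_product p ON f.product_id = p.product_id
    GROUP BY s.region, p.category
    ORDER BY s.region, p.category
""").fetchall()
print(rows)  # [('East', 'Dining', 40.0), ('East', 'Slots', 100.0), ('West', 'Slots', 60.0)]
```

The same grouped query, run against a multidimensional (MOLAP) store, would be expressed as a cube slice; the star schema is what makes it expressible in plain SQL.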
••
IBM1
TL;DR: This paper describes how one of these operations, the join operation --- probably the most important operation --- is implemented in the IBM Informix Extended Parallel Server (XPS).
Abstract: A star schema is very popular for modeling data warehouses and data marts. Therefore, it is important that a database system which is used for implementing such a data warehouse or data mart is able to efficiently handle operations on such a schema. In this paper we will describe how one of these operations, the join operation --- probably the most important operation --- is implemented in the IBM Informix Extended Parallel Server (XPS).
38 citations
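The XPS join internals are not given in the abstract above. As a rough sketch of the usual building block for star joins, a generic in-memory hash join over a small dimension table and a large fact table might look like this (all names and data are invented, and this is not the XPS implementation):

```python
# Generic hash-join sketch: build a hash table on the (small) dimension
# table, then probe it once per fact row. This build/probe asymmetry is
# why star joins, where the fact table dominates, are tractable.
def hash_join(fact_rows, dim_rows, fact_key, dim_key):
    """Join fact rows to dimension rows via an in-memory hash table."""
    index = {row[dim_key]: row for row in dim_rows}  # build phase
    for fact in fact_rows:                           # probe phase
        dim = index.get(fact[fact_key])
        if dim is not None:                          # inner join: drop misses
            yield {**fact, **dim}

# Hypothetical sample data.
facts = [{"store_id": 1, "amount": 100.0}, {"store_id": 2, "amount": 60.0}]
stores = [{"store_id": 1, "region": "East"}, {"store_id": 2, "region": "West"}]

joined = list(hash_join(facts, stores, "store_id", "store_id"))
print(joined[0]["region"])  # East
```

A parallel server such as XPS would additionally partition the build and probe phases across nodes; that partitioning is the paper's subject, not reproduced here.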
•
TL;DR: The MetaMIS approach is extended to enable the specification of fact calculations in data mart environments and shows how fact calculations can be specified from a management point of view.
Abstract: Based on recent work on the so called MetaMIS approach we show how fact calculations can be specified from a management point of view. The MetaMIS approach's intention is to specify management views on business processes. It comprises a language, a representation formalism and guidelines to define information required for management decisions. Information in general should have pragmatic meaning for the management user. Beyond the task of specifying information in this sense, fact calculations are required to manipulate information. Respective analyzing tasks typically deal with variances, growth rates and other relevant aspects of business processes. We extend the MetaMIS approach to enable the specification of fact calculations in data mart environments.
34 citations
••
01 Jan 2002
TL;DR: This chapter provides a general framework for data warehouse design based on quality, for systems that aggregate and customize data with respect to the business and organizational criteria required by decision makers.
Abstract: Data warehousing is a new technology which provides a software infrastructure for decision support systems and OLAP applications. Data warehouse systems collect data from heterogeneous and distributed sources, then transform and reconcile this data in order to aggregate it and customize it with respect to the business and organizational criteria required by decision makers. High-level aggregated data is organized by subject and stored in multidimensional structures in a data mart. Data quality is very important in database applications in general, and crucial in data warehousing in particular. Indeed, data warehouse systems provide aggregated data to decision makers whose actions and decisions can be highly strategic for the enterprise. Providing dirty, imprecise, or incoherent data may lead to rejection of the decision support system, or may result in unproductive decisions. This chapter provides a general framework for data warehouse design based on quality.
29 citations
01 Jan 2002
TL;DR: This paper proposes two approaches to modeling heterogeneous data in multidimensional structures to support geographic knowledge discovery, through exploration of detailed data for a single epoch and of integrated, comparable data for time-variant studies.
Abstract: In this paper, we study some problems linked to the integration of data in a spatio-temporal data warehouse. In many cases, the specifications of the data sets have evolved over time, especially when the observed period is large. Under those circumstances, data sources exhibit temporal, spatial and semantic heterogeneity. In order to explore and analyse spatio-temporal data sets in a SOLAP (Spatial On-Line Analytical Processing) application, we propose two approaches to model heterogeneous data in multidimensional structures. The first solution consists in a unique temporally integrated cube with all the data of all epochs. The second solution consists in creating a specific cube (data mart) for each specific view that users want to analyse. The final objective is to support geographic knowledge discovery through data exploration of detailed data for an epoch and of integrated comparable data for time-variant studies. Using a practical example in the field of forestry, we evaluate the implementation of these two models.
26 citations
•
IBM1
TL;DR: In this article, a method is presented for programmatically deriving street geometry data from address data in textual format: a collection of address information is used as input and processed in a novel manner to populate tables of a spatially-enabled database.
Abstract: Techniques are disclosed for programmatically deriving street geometry data from address data which is presented in textual format. A collection of address information is used as input, and is processed in a novel manner to populate tables of a spatially-enabled database. Preferred embodiments use a data mart schema which is disclosed, and leverage built-in functions of a spatially-enabled object relational database system. In contrast to prior art techniques, the present invention does not require input data to be encoded in a “well-known” format (i.e. a format that adheres to particular predefined syntax conventions known as “WKT” or “WKB”); rather, textual information of the type which is readily available from government and/or commercial sources may be used as input. The derived street geometry supports retrievals which do not rely on proprietary file formats or binary files, thereby enabling faster retrievals and reduced resource consumption requirements.
17 citations
•
26 Apr 2002
TL;DR: YAM^2, a multidimensional conceptual model for OLAP (On-Line Analytical Processing), is proposed as an extension of UML (Unified Modeling Language) to benefit from Object-Oriented concepts and relationships, allowing the definition of semantically rich multi-star schemas.
Abstract: This thesis proposes YAM^2, a multidimensional conceptual model for OLAP (On-Line Analytical Processing). It is defined as an extension of UML (Unified Modeling Language). The aim is to benefit from Object-Oriented concepts and relationships to allow the definition of semantically rich multi-star schemas. Thus, the usage of Generalization, Association, Derivation, and Flow relationships (in UML terminology) is studied. An architecture based on different levels of schemas is proposed, and the characteristics of its different levels are defined. The benefits of this architecture are twofold. Firstly, it relates Federated Information Systems with Data Warehousing, so that advances in one area can also be used in the other. Secondly, the Data Mart schemas are defined so that they can be implemented on different Database Management Systems, while still offering a common integrated vision that allows navigation through the different stars. The main concepts of any multidimensional model are facts and dimensions. Both are analyzed separately, based on the assumption that relationships between aggregation levels are part-whole (or composition) relationships. Thus, mereology axioms are used in that analysis to prove some properties. Besides structures, operations and integrity constraints are also defined for YAM^2. Because, in this thesis, a data cube is defined as a function, operations (i.e. Drill-across, ChangeBase, Roll-up, Projection, and Selection) are defined over functions. Regarding the set of integrity constraints, they reflect the importance of summarizability (or aggregability) of measures, and pay special attention to it.
15 citations
••
TL;DR: A case study at a large tertiary care hospital discusses a number of issues that arise in analyzing occupancy data, with implications for the design of healthcare-operations-oriented data warehouses and analysis tools.
Abstract: Managerial decision-making problems in the healthcare industry often involve considerations of customer occupancy by time of day and day of week. Through a case study at a large tertiary care hospital, we discuss a number of issues that arise in analyzing occupancy data which have implications for the design of healthcare-operations-oriented data warehouses and analysis tools. We offer practical solutions to these problems, including a transaction-oriented database design, a general database framework and software tool for analysis of occupancy-related data, and a method for simulating entity flow from the data mart.
10 citations
•
TL;DR: Analysis of data collected from more than two dozen large companies suggests that the DW concept has been implemented quite differently across enterprises, and that DW practices are still at an early stage of development.
Abstract: While data warehousing (DW) has emerged as a key component of many organizations' information systems, few studies have assessed companies' DW practices. This research examines a range of DW development and management issues. Data collected from more than two dozen large companies suggest that the DW concept has been implemented quite differently across enterprises, and that DW practices are still at an early stage of development. The results are also compared across two different DW architecture types, the hub & spoke and federated data mart approaches. This analysis suggests that the choice of architecture appears to have an important effect on a number of DW development and management measures. The results of this study should be useful to companies looking to initiate or expand their DW operations and to researchers in understanding the current scope and operations of companies' data warehousing efforts.
••
TL;DR: Clinical GeneOrganizer is a novel Windows-based archiving, organization and data mining software package for the integration of gene expression profiling in clinical medicine, and represents a valuable tool for combining gene expression analysis and clinical disease characteristics.
Abstract: Clinical GeneOrganizer (CGO) is a novel Windows-based archiving, organization and data mining software package for the integration of gene expression profiling in clinical medicine. The program implements various user-friendly tools and extracts data for further statistical analysis. This software was written for Affymetrix GeneChip *.txt files, but can also be used for any other microarray-derived data. The MS-SQL Server version acts as a data mart and links microarray data with clinical parameters of any other existing database, and therefore represents a valuable tool for combining gene expression analysis and clinical disease characteristics.
•
20 Dec 2002
TL;DR: In this article, a data mart structure and operation support system provides a data mart automatic generation function, a web retrieval/report preparation function, and an operation control function, so that a specific database for extracting and working data from a trunk database can be constructed and operated.
Abstract: PROBLEM TO BE SOLVED: To quickly prepare a data mart corresponding to a user's request by automatically generating a high-speed data mart preparation program through selective designation from a screen, according to the model of a prepared program structure. SOLUTION: This integrated data mart structure and operation support system is provided with a data mart automatic generation function, a web retrieval/report preparation function, and an operation control function, so that a specific database for extracting and working data from a trunk database, and for preserving necessary information, can be constructed and operated.
••
01 Jan 2002
TL;DR: In this paper, the economic justification of Data Warehousing projects is analyzed, and first results from a large academia-industry collaboration project in the field of nontechnical issues of Data Warehousing are presented.
Abstract: Project justification is regarded as one of the major methodological deficits in Data Warehousing practice. As reasons for applying inappropriate methods, performing incomplete evaluations, or even entirely omitting justifications, the special nature of Data Warehousing benefits and the large portion of infrastructure-related activities are stated. In this chapter, the economic justification of Data Warehousing projects is analyzed, and first results from a large academia-industry collaboration project in the field of nontechnical issues of Data Warehousing are presented. As conceptual foundations, the role of the Data Warehouse system in corporate application architectures is analyzed, and the specific properties of Data Warehousing projects are discussed. Based on an applicability analysis of traditional approaches to economic IT project justification, basic steps and responsibilities for the justification of Data Warehousing projects are derived.
••
10 Dec 2002
TL;DR: The data mart creation service moves data from a variety of sources such as ERP system, B2B transaction system, Supply Chain Management system, EDA system and MES databases to build the Turnkey data mart.
Abstract: The TSMC turnkey data mart is a collection of databases designed to help managers make strategic decisions about their business. It focuses on the operations management department and aids its managers and planners in getting useful information from the system. The data mart creation service moves data from a variety of sources, such as the ERP system, B2B transaction system, Supply Chain Management system, EDA system, and MES databases, to build the turnkey data mart. Data mart creation is used by IT administrators to deliver star or snowflake schemas, which present multi-dimensional models of data warehouses. The dimensional model contains the same information as a complex Entity Relationship model, but it makes the information easier to understand, facilitates querying, and is resilient to change.
•
22 Nov 2002
TL;DR: In this paper, a data warehouse is used to record histories and conditions for the contract form of a service, service utilization and merchandise purchase as customer information in addition to individual information for each customer registered from a channel management system.
Abstract: PROBLEM TO BE SOLVED: To provide a marketing management device that manages the contract/sales conditions of an agency and the support activity conditions of customers on the carrier side, and supplies appropriate incentives and marketing information for general customer support to the agency. SOLUTION: A data warehouse 11 records histories and conditions for the contract form of a service, service utilization, and merchandise purchase as customer information, in addition to individual information for each customer registered from a channel management system 2. A customer value data mart 12 calculates the profits each customer has brought to the telecommunications carrier to date by analyzing the customer information recorded in the data warehouse 11, and calculates customer value information for each customer. A marketing management part 13 allocates funds usable for customer retention to each agency, according to customer-management criteria set by a manager, on the basis of the calculated customer value information, and supplies the fund information and the marketing information to each agency's terminal 3.
•
20 Jun 2002
TL;DR: In this article, a method and system for performing real-time transformations of dynamically increasing databases is described, in which a session, identified as a real time session, is initialized.
Abstract: A method and system thereof for performing real time transformations of dynamically increasing databases is described. A session, identified as a real time session, is initialized. The real time session repeatedly executes a persistent (e.g., continually running) data transport pipeline of the analytic application. The data transport pipeline extracts data from a changing database, transforms the data, and writes the transformed data to storage (e.g., a data warehouse or data mart). The data transport pipeline is executed at the end of each time interval in a plurality of contiguous time intervals occurring during the real time session. The data transport pipeline remains running after it is executed, until the real time session is completed. Accordingly, new data are transformed in a timely manner, and processing resources are not consumed by having to repeatedly re-establish (re-initialize) the data transport pipeline.
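The pipeline behavior described above (initialize once, then run at the end of each interval over only the newly arrived rows, without re-establishing the pipeline) can be sketched as a toy simulation. Class and attribute names are invented, and a trivial transform stands in for the real one:

```python
# Toy sketch of a persistent incremental ETL pipeline (names invented;
# not the patented implementation). A high-water-mark cursor ensures each
# interval processes only rows that arrived since the previous run.
class RealTimePipeline:
    def __init__(self, source):
        self.source = source   # a growing list standing in for the source DB
        self.cursor = 0        # high-water mark: rows already processed
        self.warehouse = []    # stands in for the target data mart

    def run_interval(self):
        """Extract, transform, and load only the new rows; no re-initialization."""
        new_rows = self.source[self.cursor:]
        self.cursor = len(self.source)
        self.warehouse.extend(x * 2 for x in new_rows)  # trivial "transform"

source = [1, 2]
p = RealTimePipeline(source)
p.run_interval()        # first interval: rows 1 and 2 are processed
source.append(3)        # new data arrives during the real-time session
p.run_interval()        # second interval: only row 3 is processed
print(p.warehouse)      # [2, 4, 6]
```

The cursor is what makes the session cheap to repeat: each `run_interval` call costs only the new data, which is the claimed advantage over re-initializing the pipeline per run.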
•
20 Nov 2002
TL;DR: In this article, a system for distributed remote education using the Internet, and a method for managing the same, are provided to implement an education-information e-marketplace capable of reasonable cost calculation and tailored delivery of education contents through data mining, by adding a collaborative-work concept for education-information stakeholders to the life cycle of education contents.
Abstract: PURPOSE: A system for distributed remote education using the Internet, and a method for managing the same, are provided to implement an education-information e-marketplace capable of performing reasonable cost calculation and a tailored service of education contents through data mining, by adding a collaborative-work concept for education-information stakeholders to the life cycle of education contents. CONSTITUTION: An integrated client supporting server (100) is connected to the Internet through an Internet host. The server (100) includes a web server (110); a servlet engine (120) for managing servlets interlocked with the web server (110); a JSP (JavaServer Pages) engine (130) for managing JSPs; a first server area comprising JavaBeans (140), which are reusable components; and a second server area comprising EJB (Enterprise JavaBeans) components such as a Session Bean (160) and an Entity Bean (170), and a server program (150) such as a web application server. A data storage architecture (200) constitutes a data warehouse of a distributed environment and storage such as a data mart. A client (300) includes various Internet devices (310) of education-information users capable of connecting to the integrated client supporting server (100) through the Internet using an Internet communication unit, and user interfaces (320, 330) for displaying information and handling two-way communication.
•
TL;DR: Data mining technology and its requirements for crime analysis are introduced, and some methods based on a sample crime-person data mart are shown.
Abstract: Data mining technology and its requirements for crime analysis are introduced. Some data mining methods based on a sample crime-person data mart are shown. A successful mining mode and conclusions are also given.
••
16 May 2002
TL;DR: The design framework is described for a cluster of data marts whose purpose is to provide clinicians and researchers efficient access to a large volume of raw and processed patient images and associated data, originating from multiple operational systems over time and spread across different hospital departments and laboratories.
Abstract: The purpose of this paper is to demonstrate the importance of building a brain imaging registry (BIR) on top of existing medical information systems including Picture Archiving Communication Systems (PACS) environment. We describe the design framework for a cluster of data marts whose purpose is to provide clinicians and researchers efficient access to a large volume of raw and processed patient images and associated data originating from multiple operational systems over time and spread out across different hospital departments and laboratories. The framework is designed using object-oriented analysis and design methodology. The BIR data marts each contain complete image and textual data relating to patients with a particular disease.
•
01 Jan 2002
TL;DR: MF-Retarget is presented, a query retargeting mechanism that deals with both conventional star schemas and multiple facttable (MFT) schemas that is often used to implement a DW using distinct, but interrelated Data Marts.
Abstract: Performance is a critical issue in Data Warehouse systems (DWs), due to the large amounts of data manipulated and the type of analysis performed. A common technique used to improve performance is the use of pre-computed aggregate data, but the use of aggregates must be transparent for DW users. In this work, we present MF-Retarget, a query retargeting mechanism that deals with both conventional star schemas and multiple fact table (MFT) schemas. This type of multidimensional schema is often used to implement a DW using distinct, but interrelated, Data Marts. The paper presents the retargeting algorithm and initial performance tests. 1 Introduction. Data warehouses (DW) are analytical databases aimed at providing intuitive access to information useful for decision-making processes. A Data Mart (DM), often referred to as a subject-oriented DW, represents a subset of the DW, comprised of relevant data for a particular business function (e.g. marketing, sales). DW/DM handle large volumes of data, and they are often designed using a star schema, which contains relatively few tables and well-defined join paths. On-line Analytical Processing (OLAP) systems are the predominant front-end tools used in DW environments, which typically explore this multidimensional data structure [3, 13]. OLAP operations (e.g. drill down, roll up, slice and dice) typically result in SQL queries in which aggregation functions (e.g. SUM, COUNT) are applied to fact table attributes, using dimension table attributes as grouping columns (group by clause).
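MF-Retarget's algorithm itself is not given in the abstract. A minimal sketch of the general retargeting idea (answer a query from a pre-computed aggregate whose grouping columns cover the query's, falling back to the fact table otherwise) follows; the table names and selection rule are invented for illustration:

```python
# Toy aggregate-retargeting sketch (not MF-Retarget itself). An aggregate
# can answer a query only if it retains every column the query groups on;
# among the candidates, prefer the one with the fewest grouping columns,
# i.e. the most condensed (and usually smallest) table.
def retarget(query_group_cols, aggregates):
    """Return the name of the table that should answer the query."""
    candidates = [
        (name, cols) for name, cols in aggregates.items()
        if set(query_group_cols) <= set(cols)
    ]
    if not candidates:
        return "fact_sales"            # fall back to the base fact table
    return min(candidates, key=lambda nc: len(nc[1]))[0]

# Hypothetical pre-computed aggregates, keyed by name, with their columns.
aggs = {
    "agg_region_month": ["region", "month"],
    "agg_region":       ["region"],
}
print(retarget(["region"], aggs))           # agg_region
print(retarget(["region", "month"], aggs))  # agg_region_month
print(retarget(["product"], aggs))          # fact_sales
```

A real retargeting mechanism would also rewrite the SQL and handle MFT schemas spanning several data marts; this sketch only shows the containment test that makes aggregates transparent to the user's query.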
••
01 Jan 2002
TL;DR: This paper discusses implementation of very large multi-purpose Business Intelligence systems, with a focus on system architecture and integration, data modelling, and productionalisation of data mining.
Abstract: We discuss implementation of very large multi-purpose Business Intelligence systems. We focus on system architecture and integration, data modelling and productionalisation of data mining. The ideas presented come mostly from the author’s experience with implementations of large Business Intelligence projects at the Bank of Montreal. Some of these projects have won major international awards and recognition for unique, integrated, well performing and highly scalable solutions.
••
01 Jan 2002
TL;DR: The enterprise Data Warehouse (DW) is often claimed to be the corporate knowledge repository; but is corporate knowledge derived only from structured information stored within the organization during its operation?
Abstract: Due to its purpose and its sophisticated methods of data processing, the enterprise Data Warehouse (DW) is often claimed to be the corporate knowledge repository. However, is corporate knowledge derived only from structured information (data) stored within the organization during its operation?