
Showing papers on "Data mart" published in 2005


Proceedings ArticleDOI
03 Jan 2005
TL;DR: This paper lays the grounds for an approach to the automatic generation of multidimensional schemes by presenting a set of algebraic operators that automatically transform OLAP requirements, specified in a tabular format, into data marts modelled as either star or constellation schemes.
Abstract: Summary form only given. The manual design of data warehouse and data mart schemes can be a tedious, error-prone, and time-consuming task. In addition, it is a highly complex engineering task that calls for methodological support. This paper lays the grounds for an approach to the automatic generation of multidimensional schemes. It first defines a tabular format for OLAP requirements. Secondly, it presents a set of algebraic operators that automatically transform the OLAP requirements, specified in the tabular format, into data marts modelled as either star or constellation schemes. Our approach is illustrated with an example.

23 citations
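
As a concrete illustration (not the paper's actual operators), here is a minimal Python sketch of how one row of a tabular OLAP requirement might become a star schema; the requirement fields (subject, indicators, axes) are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class StarSchema:
    fact: str        # fact table name
    measures: list   # numeric indicators attached to the fact
    dimensions: dict # dimension name -> ordered list of attributes

def build_star(requirement: dict) -> StarSchema:
    """One row of the tabular requirement format becomes a star schema:
    the analysed subject is the fact, the indicators are measures, and
    each analysis axis becomes a dimension."""
    return StarSchema(
        fact=requirement["subject"],
        measures=list(requirement["indicators"]),
        dimensions={axis: list(attrs)
                    for axis, attrs in requirement["axes"].items()},
    )

if __name__ == "__main__":
    req = {"subject": "Sales",
           "indicators": ["amount", "quantity"],
           "axes": {"Time": ["day", "month", "year"],
                    "Store": ["store_id", "city", "region"]}}
    print(build_star(req))
```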


Book ChapterDOI
01 Jan 2005
TL;DR: A framework for leveraging franchise organizational data, information, and knowledge assets to acquire and maintain a competitive advantage is proposed; it is the culmination of the authors’ years of research and experience in the franchising industry.
Abstract: Franchising has been used by businesses as a growth strategy. Based on the authors’ cumulative research and experience in the industry, this paper describes a comprehensive framework that covers both the franchise environment — from customer services to internal operations — and the pertinent data items in the system. The authors identify the most important aspects of a franchising business, the role that online analytical processing (OLAP) and data mining play, and the data items that data mining should focus on to ensure its success. INTRODUCTION Franchising has been popular as a growth strategy for businesses (Justis & Judd, 2002), and its popularity continues to increase in today’s e-commerce-centered global economy. Take Entrepreneur.com, for example. In early 2001, the company added a category called Tech Businesses to its Franchise Zone, containing the subcategories Internet Businesses, Tech Training, and Miscellaneous Tech Businesses. At the time of writing, 27 companies are on the list on the Entrepreneur.com Web site. A recent Jupiter Report (2001) recommends using strategic partnerships, such as joint ventures and franchises, to enter e-commerce. A good demonstration of this type of strategic partnership is the online bank Juniper, whose customers may deposit checks at the franchise chain Mail Boxes Etc. (Porter, 2001). Leaders of other industries also recognize the benefits of such cooperation or symbiosis. For instance, Gates (1999) believes that information technology and business are becoming inextricably interwoven. This integration/interaction among businesses and technologies is especially pertinent in franchise organizations. For example, McDonald’s real moneymaking engine is its little-known real estate business, Franchise Realty Corp. (Love, 1995). This ability to leverage the assets of franchise operations (real estate in this case) into profitable products or services is at the heart of a successful franchise. Thus, any effort to obtain “meaningful” information in franchise organizations must take this lesson to heart, and a tool for recognizing meaningful patterns in both internal and external data sources can give those in charge the ability to see the big picture without being sidetracked by the tedious process of sifting through mountains of data. Leveraging franchise assets must be built upon sound fundamental practices. Among the many fundamental practices for franchise growth, developing a good relationship between the franchisor and the franchisee is believed to be the most important (Justis & Judd, 2002). This relationship is developed during the time when a franchisee learns how the business operates.
Since all of these elements are learned through working knowledge, working knowledge becomes the base of the franchise “family” relationship, and, through the learning process, working knowledge is disseminated throughout the system. Working knowledge is generally accumulated from information that is deciphered from data analyses. In this paper, based on the concept of the Digital Nervous System (DNS) suggested by Gates (1999), we propose a framework for leveraging franchise organizational data, information, and knowledge assets to acquire and maintain a competitive advantage. This framework is the culmination of the authors’ years of research and experience in the franchising industry. MANAGING FRANCHISE ORGANIZATIONAL DATA According to Gates (1999, p. xviii), a DNS is the digital equivalent of the human nervous system in a corporation, providing information to the right part of the organization at the right time. A DNS “consists of the digital processes that enable a company to perceive and react to its environment, to sense competitor challenges and customer needs, and to organize timely responses,” and “it’s distinguished from a mere network of computers by the accuracy, immediacy, and richness of the information it brings to knowledge workers and the insight and collaboration made possible by the information.” The development of a DNS goes through three phases: (1) the Empowerment and Collaboration Phase, (2) the Business Intelligence and Knowledge Management Phase, and (3) the High Business Value Creation and Implementation Phase. Specifically, the following questions need to be addressed in the franchise industry: 1. How is franchise organizational data being collected, used, renewed, stored, retrieved, transmitted, and shared in the Empowerment and Collaboration Phase? 11 more pages are available in the full version of this document at the publisher's webpage: www.igi-global.com/chapter/data-mining-franchiseorganizations/27918

19 citations


Proceedings Article
01 Jan 2005
TL;DR: This paper lays the grounds for an automatic, stepwise approach to the generation of data warehouse and data mart schemes by proposing a standard format for OLAP requirement acquisition and defining an algorithm that automatically transforms the OLAP requirements into data marts modelled as either star or constellation schemes.
Abstract: Data warehouse design involves the definition of structures that enable efficient access to information. The designer builds a multidimensional structure taking into account the users' requirements. It is a highly complex engineering task that calls for methodological support. This paper lays the grounds for an automatic, stepwise approach to the generation of data warehouse and data mart schemes. It first proposes a standard format for OLAP requirement acquisition. Secondly, it defines an algorithm that automatically transforms the OLAP requirements into data marts modelled as either star or constellation schemes. Thirdly, it gives an overview of our mapping rules between the data sources and the data mart schemes.

19 citations
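
The abstract does not spell out the mapping rules; as one plausible building block, this hypothetical sketch matches data mart attributes to source columns by name similarity. The threshold and matching rule are assumptions for illustration only:

```python
from difflib import SequenceMatcher

def map_attributes(mart_attrs, source_columns, threshold=0.8):
    """Propose, for each data mart attribute, the best-matching source
    (table, column) whose name similarity exceeds a threshold."""
    mapping = {}
    for attr in mart_attrs:
        best, score = None, 0.0
        for table, column in source_columns:
            s = SequenceMatcher(None, attr.lower(), column.lower()).ratio()
            if s > score:
                best, score = (table, column), s
        mapping[attr] = best if score >= threshold else None
    return mapping

if __name__ == "__main__":
    print(map_attributes(
        ["city", "sales_amount"],
        [("customer", "City"), ("orders", "SalesAmount"), ("orders", "Qty")],
    ))
```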


Book ChapterDOI
01 Jan 2005
TL;DR: Several techniques used in data warehouses to accelerate OLAP processing are discussed, and a dynamic view management system is used to illustrate the techniques of dynamic view selection and view maintenance.
Abstract: Data warehousing and on-line analytical processing (OLAP) are becoming important tools for decision making in corporations and other organizations, and are one of the main focuses of the database industry. However, the functions and properties of a decision support system are rather different from those of a traditional database application. For example, a user of a decision support system may be interested in the trend of certain data rather than the actual data itself. Another feature of a data warehouse system is that the amount of data inside is tremendous, which means that traditional query processing on these data can be very time consuming. In this survey paper, we mainly discuss several techniques used in data warehouses to accelerate OLAP processing. The rest of the paper is organized as follows: Chapter 1 is the introduction, in which we give an overview of current technology used in the area of data warehousing and OLAP. In Chapter 2, we talk about a new aggregation operator called the Data Cube operator, which can perform N-dimensional aggregation. From Chapter 3, we begin to discuss one of the most important issues in data warehousing and OLAP: view materialization and view maintenance. Chapter 3 gives a general introduction to the problems and techniques of materialized view maintenance. Chapter 4 discusses some techniques developed based on the space constraints of a data warehouse. In Chapter 5, we use a dynamic view management system to discuss the techniques of dynamic view selection and view maintenance.

11 citations
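
To make the Data Cube operator the survey covers concrete, this toy sketch computes all 2^N group-by aggregations of a small fact table (a re-implementation for illustration, not code from the paper):

```python
from itertools import combinations

def data_cube(rows, dims, measure):
    """Return {grouping: {key: sum}} for every subset of dims,
    including the empty grouping (the ALL aggregate)."""
    cube = {}
    for r in range(len(dims) + 1):
        for group in combinations(dims, r):
            agg = {}
            for row in rows:
                key = tuple(row[d] for d in group)
                agg[key] = agg.get(key, 0) + row[measure]
            cube[group] = agg
    return cube

if __name__ == "__main__":
    sales = [
        {"year": 2005, "region": "EU", "amount": 10},
        {"year": 2005, "region": "US", "amount": 7},
        {"year": 2004, "region": "EU", "amount": 5},
    ]
    for group, agg in data_cube(sales, ("year", "region"), "amount").items():
        print(group or ("ALL",), agg)
```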


Patent
14 Jul 2005
TL;DR: In this patent, a computer readable memory with a consolidated data mart generator is used to generate a consolidated data mart based on an analysis of a repository of individual reports, and a report generation tool produces a report via access to the consolidated data mart.
Abstract: The invention includes a computer readable memory with a consolidated data mart generator to generate a consolidated data mart based upon an analysis of a repository of individual reports. A report generation tool produces a report via access to the consolidated data mart.

9 citations
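
One hedged reading of the mechanism: scan a repository of individual report definitions and derive the union of fields (with provenance) that a consolidated data mart would need in order to serve every report. The report structure below is an illustrative assumption:

```python
def consolidate(report_repository):
    """Union the fields used across all reports, recording which
    reports reference each field, as input to a consolidated data mart."""
    field_usage = {}
    for report in report_repository:
        for f in report["fields"]:
            field_usage.setdefault(f, set()).add(report["name"])
    return field_usage

if __name__ == "__main__":
    repo = [
        {"name": "monthly_sales", "fields": ["date", "store", "amount"]},
        {"name": "inventory", "fields": ["date", "store", "stock_level"]},
    ]
    for field, used_by in consolidate(repo).items():
        print(field, "->", sorted(used_by))
```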


Book ChapterDOI
01 Jan 2005
TL;DR: A data warehouse (DW) is a collection of technologies aimed at enabling the knowledge worker (executive, manager, analyst, etc.) to make better and faster decisions; its architecture exhibits various layers of data in which data in one layer are derived from data in the layer below.
Abstract: A data warehouse (DW) is a collection of technologies aimed at enabling the knowledge worker (executive, manager, analyst, etc.) to make better and faster decisions. The architecture of a DW exhibits various layers of data in which data from one layer are derived from data of the lower layer (see Figure 1). The operational databases, also called data sources, form the starting layer. They may consist of structured data stored in open database systems and legacy systems, or even in files. The central layer of the architecture is the global DW. The global DW keeps a historical record of data that result from the transformation, integration, and aggregation of detailed data found in the data sources. An auxiliary area of volatile data, the data staging area (DSA), is employed for the purpose of data transformation, reconciliation, and cleaning. The next layer of data involves client warehouses, which contain highly aggregated data directly derived from the global warehouse. There are various kinds of local warehouses, such as data marts or on-line analytical processing (OLAP) databases, which may use relational database systems or specific multidimensional data structures. The whole environment is described in terms of its components, metadata, and processes in a central metadata repository located at the DW site. In order to facilitate and manage the DW operational processes, specialized tools are available in the market under the general title of extraction-transformation-loading (ETL) tools. ETL tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization, and insertion into a DW (see Figure 2). The functionality of these tools includes…

8 citations
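
The extract-transform-load flow the chapter describes can be sketched in a few lines; the CSV source, cleaning rule, and SQLite target here are illustrative assumptions, not the chapter's tooling:

```python
import csv
import io
import sqlite3

# Toy source data standing in for an operational extract.
SOURCE_CSV = "date,store,amount\n2005-01-03,S1,120\n2005-01-03,S2,\n"

def extract(text):
    """Extraction step: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Cleaning step: drop rows with a missing measure, cast types."""
    return [(r["date"], r["store"], float(r["amount"]))
            for r in rows if r["amount"]]

def load(rows, conn):
    """Loading step: insert cleaned rows into the warehouse table."""
    conn.execute("CREATE TABLE sales (date TEXT, store TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?,?,?)", rows)

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(transform(extract(SOURCE_CSV)), conn)
    print(conn.execute("SELECT SUM(amount) FROM sales").fetchone())
```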


Journal ArticleDOI
TL;DR: This work proposes a methodology for acquiring and maintaining the necessary knowledge efficiently using data mart and web mining technology; its effectiveness has been shown in an application for a bank's web site.
Abstract: The Internet has become an important medium for effective marketing and efficient operations for many institutions. Visitors of a particular web site leave behind valuable information on their preferences, requirements, and demands regarding the offered products and/or services. Understanding these requirements online, i.e., during a particular visit, is both a difficult technical challenge and a tremendous business opportunity. Web sites that can provide effective online navigation suggestions to their visitors can exploit the potential inherent in the data such visits generate every day. However, identifying, collecting, and maintaining the necessary knowledge that navigation suggestions are based on is far from trivial. We propose a methodology for acquiring and maintaining this knowledge efficiently using data mart and web mining technology. Its effectiveness has been shown in an application for a bank's web site.

7 citations
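
The abstract leaves the mining method open; one simple form such navigation knowledge could take is a table of page-to-page transition counts mined from session logs, as in this illustrative sketch:

```python
from collections import Counter, defaultdict

def mine_transitions(sessions):
    """Count page-to-page transitions across all visitor sessions."""
    trans = defaultdict(Counter)
    for pages in sessions:
        for a, b in zip(pages, pages[1:]):
            trans[a][b] += 1
    return trans

def suggest(trans, current_page, k=2):
    """Suggest the k most common next pages after the current one."""
    return [p for p, _ in trans[current_page].most_common(k)]

if __name__ == "__main__":
    sessions = [["home", "loans", "rates"],
                ["home", "loans", "contact"],
                ["home", "accounts"]]
    t = mine_transitions(sessions)
    print(suggest(t, "loans"))  # e.g. ['rates', 'contact']
```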


Proceedings Article
01 Jan 2005
TL;DR: This paper proposes an approach for the automatic generation of a data warehouse schema from data mart schemes, based on a two-phase integration method defined in terms of a set of rules.
Abstract: This paper proposes an approach for the automatic generation of a data warehouse schema from data mart schemes. Our approach integrates the multidimensional schemes of data marts (star/constellation) to generate a data warehouse schema. It is based on a two-phase integration method defined in terms of a set of rules. The first phase transforms each multidimensional model into a UML class diagram. The second phase builds the data warehouse schema by integrating the UML class diagrams. The UML class diagram is appropriate for representing the different concepts of the two types of DM/DW models.

6 citations
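
A rough sketch of the two-phase method as described: phase one turns each star schema into a small class model, and phase two integrates the models, here by unifying same-named dimensions (a simplifying assumption; the paper's actual rules are richer):

```python
from dataclasses import dataclass, field

@dataclass
class ClassDiagram:
    facts: dict = field(default_factory=dict)       # fact -> measures
    dimensions: dict = field(default_factory=dict)  # dim -> attributes

def to_diagram(star):
    """Phase 1: a star schema becomes a small class diagram."""
    return ClassDiagram({star["fact"]: star["measures"]},
                        dict(star["dimensions"]))

def integrate(diagrams):
    """Phase 2: union facts; merge same-named dimensions by attribute union."""
    dw = ClassDiagram()
    for d in diagrams:
        dw.facts.update(d.facts)
        for name, attrs in d.dimensions.items():
            merged = dw.dimensions.setdefault(name, [])
            merged.extend(a for a in attrs if a not in merged)
    return dw

if __name__ == "__main__":
    sales = {"fact": "Sales", "measures": ["amount"],
             "dimensions": {"Time": ["day", "year"], "Store": ["city"]}}
    stock = {"fact": "Stock", "measures": ["level"],
             "dimensions": {"Time": ["day", "month"]}}
    print(integrate([to_diagram(sales), to_diagram(stock)]))
```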


Proceedings ArticleDOI
19 Sep 2005
TL;DR: This work shows an approach to the analysis of complex log data based on a parallel stream processing architecture and the use of specialized languages, namely a grammatical parser and a logic programming module, which together offer an efficient, flexible, and powerful solution.
Abstract: Navigation and interaction patterns of Web users can be relatively complex, especially for sites with interactive applications that support user sessions and profiles. We describe such a case for an interactive virtual garment dressing room. The application is distributed over many web sites; it supports personalization, user profiles, and the notion of a multi-site user session. It has its own data logging system that generates approximately 5 GB of complex data per month. The analysis of those logs requires more sophisticated processing than is typically done using a relational language. Even the use of procedural languages and a DBMS can prove tedious and inefficient. We show an approach to the analysis of complex log data based on a parallel stream processing architecture and the use of specialized languages, namely a grammatical parser and a logic programming module, that offers an efficient, flexible, and powerful solution.

4 citations
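
A toy version of the pipeline's front end, under stated assumptions: a regular expression stands in for the grammatical parser, and a sessionizer groups the event stream per user. The log format, parallelism, and logic-programming stage of the real system are not reproduced:

```python
import re

# Assumed line format; the real application's log grammar is richer.
LOG_LINE = re.compile(
    r"(?P<ts>\d+) site=(?P<site>\S+) user=(?P<user>\S+) ev=(?P<ev>\S+)")

def parse_stream(lines):
    """Yield parsed events, skipping lines the grammar rejects."""
    for line in lines:
        m = LOG_LINE.match(line)
        if m:
            yield m.groupdict()

def sessionize(events, gap=1800):
    """Group events per user into sessions, splitting whenever more
    than `gap` seconds pass between consecutive events."""
    sessions, last = {}, {}
    for e in events:
        u, t = e["user"], int(e["ts"])
        if u not in sessions or t - last[u] > gap:
            sessions.setdefault(u, []).append([])
        sessions[u][-1].append(e["ev"])
        last[u] = t
    return sessions

if __name__ == "__main__":
    log = ["100 site=a user=u1 ev=try_garment",
           "200 site=b user=u1 ev=save_profile",
           "9000 site=a user=u1 ev=try_garment"]
    print(sessionize(parse_stream(log)))
```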


Journal ArticleDOI
TL;DR: In this paper, the authors present a case in which the College of Business at a state university in the southern United States has implemented an information system (IS) with a data mart.
Abstract: This paper presents a case in which the College of Business (COB) at a state university in the southern United States has implemented an information system (IS) with a data mart. The system has been...

3 citations


Book ChapterDOI
01 Jan 2005
TL;DR: This article presents a process for executing data-mining projects; the emphasis is not so much on identifying ways to store data or on consolidating and aggregating data to provide a single, unified perspective, but on sifting through large volumes of historical data for new and valuable information that will lead to competitive advantage.
Abstract: In contrast to the Industrial Revolution, the Digital Revolution is happening much more quickly. For example, in 1946, the world’s first programmable computer, the Electronic Numerical Integrator and Computer (ENIAC), stood 10 feet tall, stretched 150 feet wide, cost millions of dollars, and could execute up to 5,000 operations per second. Twenty-five years later, Intel packed 12 times ENIAC’s processing power into a 12-square-millimeter chip. Today’s personal computers with Pentium processors perform in excess of 400 million instructions per second. Database systems, a subfield of computer science, have also seen notably accelerated advances. A major strength of database systems is their ability to store volumes of complex, hierarchical, heterogeneous, and time-variant data and to provide rapid access to information while correctly capturing and reflecting database updates. Together with the advances in database systems, our relationship with data has evolved from the prerelational and relational period to the data-warehouse period. Today, we are in the knowledge-discovery and data-mining (KDDM) period, where the emphasis is not so much on identifying ways to store data or on consolidating and aggregating data to provide a single, unified perspective. Rather, the emphasis of KDDM is on sifting through large volumes of historical data for new and valuable information that will lead to competitive advantage. The evolution to KDDM is natural since our capabilities to produce, collect, and store information have grown exponentially. Debit cards, electronic banking, e-commerce transactions, the widespread introduction of bar codes for commercial products, and advances in both mobile technology and remote-sensing data-capture devices have all contributed to the mountains of data stored in business, government, and academic databases. Traditional analytical techniques, especially standard query and reporting and online analytical processing, are ineffective in situations involving large amounts of data and where the exact nature of the information one wishes to extract is uncertain. Data mining has thus emerged as a class of analytical techniques that go beyond statistics and aim at examining large quantities of data; data mining is clearly relevant for the current KDDM period. According to Hirji (2001), data mining is the analysis and nontrivial extraction of data from databases for the purpose of discovering new and valuable information, in the form of patterns and rules, from relationships between data elements. Data mining is receiving widespread attention in the academic and public press literature (Berry & Linoff, 2000; Fayyad, Piatetsky-Shapiro, & Smyth, 1996; Kohavi, Rothleder, & Simoudis, 2002; Newton, Kendziorski, Richmond, & Blattner, 2001; Venter, Adams, & Myers, 2001; Zhang, Wang, Ravindranathan, & Miles, 2002), and case studies and anecdotal evidence to date suggest that organizations are increasingly investigating the potential of data-mining technology to deliver competitive advantage. As a multidisciplinary field, data mining draws from many diverse areas such as artificial intelligence, database theory, data visualization, marketing, mathematics, operations research, pattern recognition, and statistics.
Research into data mining has thus far focused on developing new algorithms and tools (Dehaspe & Toivonen, 1999; Deutsch, 2003; Jiang, Pei, & Zhang, 2003; Lee, Stolfo, & Mok, 2000; Washio & Motoda, 2003) and on identifying future application areas (Alizadeh et al., 2000; Li, Li, Zhu, & Ogihara, 2002; Page & Craven, 2003; Spangler, May, & Vargas, 1999). As a relatively new field of study, it is not surprising that data-mining research is not equally well developed in all areas. To date, no theory-based process model of data mining has emerged. The lack of a formal process model to guide the data-mining effort, as well as of an identification of the relevant factors that contribute to effectiveness, is becoming more critical as data-mining interest and deployment intensify. The emphasis of this article is to present a process for executing data-mining projects.

Proceedings Article
01 Jan 2005
TL;DR: This paper defines multidimensional models and a set of operators for the validation and refinement of patterns, covering the addition and deletion of hierarchies, dimensions, dimension attributes, non-dimension attributes, measures, and facts.
Abstract: Designing a decisional system requires a methodology different from those commonly adopted for operational information systems. In our methodology, data marts are constructed on the basis of user requirements specified using OLAP design patterns. Since these patterns are independent of any data source, the data mart design process should resolve the differences between the user OLAP requirements on the one hand and the data sources on the other. This paper first defines multidimensional models. Secondly, it defines a set of operators for the validation and refinement of patterns, covering the addition and deletion of hierarchies, dimensions, dimension attributes, non-dimension attributes, measures, and facts.
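
To illustrate what such refinement operators might look like, here is a toy sketch over a dictionary-based star-schema pattern; the names and structure are assumptions, not the paper's definitions:

```python
def add_dimension(pattern, name, attributes):
    """Refinement operator: add a dimension to a pattern (non-destructive)."""
    p = dict(pattern)
    p["dimensions"] = dict(p["dimensions"])
    p["dimensions"][name] = list(attributes)
    return p

def delete_measure(pattern, measure):
    """Refinement operator: remove a measure from the fact."""
    p = dict(pattern)
    p["measures"] = [m for m in p["measures"] if m != measure]
    return p

if __name__ == "__main__":
    pattern = {"fact": "Sales", "measures": ["amount", "margin"],
               "dimensions": {"Time": ["day", "year"]}}
    pattern = add_dimension(pattern, "Customer", ["segment"])
    pattern = delete_measure(pattern, "margin")
    print(pattern)
```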


Proceedings Article
14 Jul 2005
TL;DR: The analysis, design, and implementation of a data warehouse system for decisional processes based on Italian train booking data are presented; in order to satisfy all the customer's requests, the entire warehouse's data marts concerning the project will be completely re-engineered.
Abstract: The analysis, design, and implementation of the data warehouse system for the decisional process based on the Italian train booking data are presented. Trenitalia, the main Italian train service company, is the customer, and TSF (a railway telesystems company) is the IT solution provider. In particular, the feasibility requirements, functionality, technical architecture, and product technology are described. Moreover, guidelines on interfacing operational environments with the data warehouse for data acquisition and processing, and the related problems, are dealt with. With our contribution, and with the aim of software reuse, the provider has released the prototype system; in order to satisfy all the customer's requests, the entire warehouse's data marts concerning the project will be completely re-engineered.