
Showing papers on "Data mart" published in 2005


Proceedings ArticleDOI
03 Jan 2005
TL;DR: This paper lays the grounds for an approach to the automatic generation of multidimensional schemes by presenting a set of algebraic operators that automatically transform OLAP requirements, specified in a tabular format, into data marts modelled as either star or constellation schemes.
Abstract: Summary form only given. The manual design of data warehouse and data mart schemes can be a tedious, error-prone, and time-consuming task. In addition, it is a highly complex engineering task that calls for methodological support. This paper lays the grounds for an approach to the automatic generation of multidimensional schemes. It first defines a tabular format for OLAP requirements. Secondly, it presents a set of algebraic operators that automatically transform the OLAP requirements, specified in the tabular format, into data marts modelled as either star or constellation schemes. Our approach is illustrated with an example.

23 citations
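
As a concrete illustration (not the paper's actual operators), here is a minimal Python sketch of how one row of a tabular OLAP requirement might become a star schema; the requirement fields (subject, indicators, axes) are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class StarSchema:
    fact: str        # fact table name
    measures: list   # numeric indicators attached to the fact
    dimensions: dict # dimension name -> ordered list of attributes

def build_star(requirement: dict) -> StarSchema:
    """One row of the tabular requirement format becomes a star schema:
    the analysed subject is the fact, the indicators are measures, and
    each analysis axis becomes a dimension."""
    return StarSchema(
        fact=requirement["subject"],
        measures=list(requirement["indicators"]),
        dimensions={axis: list(attrs)
                    for axis, attrs in requirement["axes"].items()},
    )

if __name__ == "__main__":
    req = {"subject": "Sales",
           "indicators": ["amount", "quantity"],
           "axes": {"Time": ["day", "month", "year"],
                    "Store": ["store_id", "city", "region"]}}
    print(build_star(req))
```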


Book ChapterDOI
01 Jan 2005
TL;DR: A framework for leveraging franchise organizational data, information, and knowledge assets to acquire and maintain a competitive advantage is proposed; it is the culmination of the authors’ years of research and experience in the franchising industry.
Abstract: Franchising has been used by businesses as a growth strategy. Based on the authors’ cumulative research and experience in the industry, this paper describes a comprehensive framework that covers both the franchise environment — from customer services to internal operations — and the pertinent data items in the system. The authors identify the most important aspects of a franchising business, the role that online analytical processing (OLAP) and data mining play, and the data items that data mining should focus on to ensure its success. INTRODUCTION Franchising has been popular as a growth strategy for businesses (Justis & Judd, 2002), and its popularity continues to increase in today’s e-commerce-centered global economy. Take Entrepreneur.com, for example. In early 2001, the company added a category called Tech Businesses to its Franchise Zone, containing the subcategories Internet Businesses, Tech Training, and Miscellaneous Tech Businesses. At the time of writing, 27 companies are on the list on the Entrepreneur.com Web site. A recent Jupiter Report (2001) recommends using strategic partnerships, such as joint ventures and franchises, to enter e-commerce. A good demonstration of this type of strategic partnership is the online bank Juniper, whose customers may deposit checks at the franchise chain Mail Boxes Etc. (Porter, 2001). Leaders of other industries also recognize the benefits of such cooperation or symbiosis. For instance, Gates (1999) believes that information technology and business are becoming inextricably interwoven. This integration/interaction among businesses and technologies is especially pertinent in franchise organizations. For example, McDonald’s real moneymaking engine is its little-known real estate business, Franchise Realty Corp. (Love, 1995). This ability to leverage the assets of franchise operations (real estate in this case) into profitable products or services is at the heart of a successful franchise. Thus, any effort to obtain “meaningful” information in franchise organizations must take this lesson to heart, and a tool for recognizing meaningful patterns in both internal and external data sources can give those in charge the ability to see the big picture without being sidetracked by the tedious process of sifting through mountains of data. Leveraging franchise assets must be built upon sound fundamental practices. Among the many fundamental practices for franchise growth, developing a good relationship between the franchisor and the franchisee is believed to be the most important (Justis & Judd, 2002). This relationship is developed during the time when a franchisee learns how the business operates.
Since all of these elements are learned through working knowledge, working knowledge becomes the base of the franchise “family” relationship, and, through the learning process, working knowledge is disseminated throughout the system. Working knowledge is generally accumulated from information that is deciphered from data analyses. In this paper, based on the concept of the Digital Nervous System (DNS) suggested by Gates (1999), we propose a framework for leveraging franchise organizational data, information, and knowledge assets to acquire and maintain a competitive advantage. This framework is the culmination of the authors’ years of research and experience in the franchising industry. MANAGING FRANCHISE ORGANIZATIONAL DATA According to Gates (1999, p. xviii), a DNS is the digital equivalent of the human nervous system in a corporation, providing information to the right part of the organization at the right time. A DNS “consists of the digital processes that enable a company to perceive and react to its environment, to sense competitor challenges and customer needs, and to organize timely responses,” and “it’s distinguished from a mere network of computers by the accuracy, immediacy, and richness of the information it brings to knowledge workers and the insight and collaboration made possible by the information.” The development of a DNS goes through three phases: (1) the Empowerment and Collaboration Phase, (2) the Business Intelligence and Knowledge Management Phase, and (3) the High Business Value Creation and Implementation Phase. Specifically, the following questions need to be addressed in the franchise industry: 1. How is franchise organizational data being collected, used, renewed, stored, retrieved, transmitted, and shared in the Empowerment and Collaboration Phase? 11 more pages are available in the full version of this document at the publisher's webpage: www.igi-global.com/chapter/data-mining-franchiseorganizations/27918

19 citations


Proceedings Article
01 Jan 2005
TL;DR: This paper lays the grounds for an automatic, stepwise approach to the generation of data warehouse and data mart schemes by proposing a standard format for OLAP requirement acquisition and defining an algorithm that automatically transforms the OLAP requirements into data marts modelled as either star or constellation schemes.
Abstract: Data warehouse design involves the definition of structures that enable efficient access to information. The designer builds a multidimensional structure taking into account the users' requirements. It is a highly complex engineering task that calls for methodological support. This paper lays the grounds for an automatic, stepwise approach to the generation of data warehouse and data mart schemes. It first proposes a standard format for OLAP requirement acquisition. Secondly, it defines an algorithm that automatically transforms the OLAP requirements into data marts modelled as either star or constellation schemes. Thirdly, it gives an overview of our mapping rules between the data sources and the data mart schemes.

19 citations
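
The abstract does not spell out the mapping rules; as one plausible building block, this hypothetical sketch matches data mart attributes to source columns by name similarity. The threshold and matching rule are assumptions for illustration only:

```python
from difflib import SequenceMatcher

def map_attributes(mart_attrs, source_columns, threshold=0.8):
    """Propose, for each data mart attribute, the best-matching source
    (table, column) whose name similarity exceeds a threshold."""
    mapping = {}
    for attr in mart_attrs:
        best, score = None, 0.0
        for table, column in source_columns:
            s = SequenceMatcher(None, attr.lower(), column.lower()).ratio()
            if s > score:
                best, score = (table, column), s
        mapping[attr] = best if score >= threshold else None
    return mapping

if __name__ == "__main__":
    print(map_attributes(
        ["city", "sales_amount"],
        [("customer", "City"), ("orders", "SalesAmount"), ("orders", "Qty")],
    ))
```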


Book ChapterDOI
01 Jan 2005
TL;DR: Several techniques used in data warehouses to accelerate OLAP processing are discussed, and a dynamic view management system is used to illustrate the techniques of dynamic view selection and view maintenance.
Abstract: Data warehousing and on-line analytical processing (OLAP) are becoming important tools for decision making in corporations and other organizations, and are one of the main focuses of the database industry. However, the functions and properties of a decision support system are rather different from those of a traditional database application. For example, a user of a decision support system may be interested in the trend of certain data rather than the actual data itself. Another feature of a data warehouse system is that the amount of data inside is tremendous, which means that traditional query processing on these data can be very time consuming. In this survey paper, we mainly discuss several techniques used in data warehouses to accelerate OLAP processing. The rest of the paper is organized as follows: Chapter 1 is the introduction, in which we give an overview of current technology used in the area of data warehousing and OLAP. In Chapter 2, we talk about a new aggregation operator called the Data Cube operator, which can perform N-dimensional aggregation. From Chapter 3, we begin to discuss one of the most important issues in data warehousing and OLAP: view materialization and view maintenance. Chapter 3 gives a general introduction to the problems and techniques of materialized view maintenance. Chapter 4 discusses some techniques developed based on the space constraints of a data warehouse. In Chapter 5, we use a dynamic view management system to discuss the techniques of dynamic view selection and view maintenance.

11 citations
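
To make the Data Cube operator the survey covers concrete, this toy sketch computes all 2^N group-by aggregations of a small fact table (a re-implementation for illustration, not code from the paper):

```python
from itertools import combinations

def data_cube(rows, dims, measure):
    """Return {grouping: {key: sum}} for every subset of dims,
    including the empty grouping (the ALL aggregate)."""
    cube = {}
    for r in range(len(dims) + 1):
        for group in combinations(dims, r):
            agg = {}
            for row in rows:
                key = tuple(row[d] for d in group)
                agg[key] = agg.get(key, 0) + row[measure]
            cube[group] = agg
    return cube

if __name__ == "__main__":
    sales = [
        {"year": 2005, "region": "EU", "amount": 10},
        {"year": 2005, "region": "US", "amount": 7},
        {"year": 2004, "region": "EU", "amount": 5},
    ]
    for group, agg in data_cube(sales, ("year", "region"), "amount").items():
        print(group or ("ALL",), agg)
```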


Patent
14 Jul 2005
TL;DR: In this patent, a computer readable memory with a consolidated data mart generator is used to generate a consolidated data mart based on an analysis of a repository of individual reports, and a report generation tool produces a report via access to the consolidated data mart.
Abstract: The invention includes a computer readable memory with a consolidated data mart generator to generate a consolidated data mart based upon an analysis of a repository of individual reports. A report generation tool produces a report via access to the consolidated data mart.

9 citations
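
One hedged reading of the mechanism: scan a repository of individual report definitions and derive the union of fields (with provenance) that a consolidated data mart would need in order to serve every report. The report structure below is an illustrative assumption:

```python
def consolidate(report_repository):
    """Union the fields used across all reports, recording which
    reports reference each field, as input to a consolidated data mart."""
    field_usage = {}
    for report in report_repository:
        for f in report["fields"]:
            field_usage.setdefault(f, set()).add(report["name"])
    return field_usage

if __name__ == "__main__":
    repo = [
        {"name": "monthly_sales", "fields": ["date", "store", "amount"]},
        {"name": "inventory", "fields": ["date", "store", "stock_level"]},
    ]
    for field, used_by in consolidate(repo).items():
        print(field, "->", sorted(used_by))
```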


Book ChapterDOI
01 Jan 2005
TL;DR: A data warehouse (DW) is a collection of technologies aimed at enabling the knowledge worker (executive, manager, analyst, etc.) to make better and faster decisions; its architecture exhibits various layers of data in which data in one layer are derived from data in the layer below.
Abstract: A data warehouse (DW) is a collection of technologies aimed at enabling the knowledge worker (executive, manager, analyst, etc.) to make better and faster decisions. The architecture of a DW exhibits various layers of data in which data from one layer are derived from data of the lower layer (see Figure 1). The operational databases, also called data sources, form the starting layer. They may consist of structured data stored in open database systems and legacy systems, or even in files. The central layer of the architecture is the global DW. The global DW keeps a historical record of data that result from the transformation, integration, and aggregation of detailed data found in the data sources. An auxiliary area of volatile data, the data staging area (DSA), is employed for the purpose of data transformation, reconciliation, and cleaning. The next layer of data involves client warehouses, which contain highly aggregated data directly derived from the global warehouse. There are various kinds of local warehouses, such as data marts or on-line analytical processing (OLAP) databases, which may use relational database systems or specific multidimensional data structures. The whole environment is described in terms of its components, metadata, and processes in a central metadata repository located at the DW site. In order to facilitate and manage the DW operational processes, specialized tools are available in the market under the general title of extraction-transformation-loading (ETL) tools. ETL tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization, and insertion into a DW (see Figure 2). The functionality of these tools includes…

8 citations
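
The extract-transform-load flow the chapter describes can be sketched in a few lines; the CSV source, cleaning rule, and SQLite target here are illustrative assumptions, not the chapter's tooling:

```python
import csv
import io
import sqlite3

# Toy source data standing in for an operational extract.
SOURCE_CSV = "date,store,amount\n2005-01-03,S1,120\n2005-01-03,S2,\n"

def extract(text):
    """Extraction step: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Cleaning step: drop rows with a missing measure, cast types."""
    return [(r["date"], r["store"], float(r["amount"]))
            for r in rows if r["amount"]]

def load(rows, conn):
    """Loading step: insert cleaned rows into the warehouse table."""
    conn.execute("CREATE TABLE sales (date TEXT, store TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?,?,?)", rows)

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(transform(extract(SOURCE_CSV)), conn)
    print(conn.execute("SELECT SUM(amount) FROM sales").fetchone())
```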


Journal ArticleDOI
TL;DR: This work proposes a methodology for acquiring and maintaining the necessary knowledge efficiently using data mart and web mining technology; its effectiveness has been shown in an application for a bank's web site.
Abstract: The Internet has become an important medium for effective marketing and efficient operations for many institutions. Visitors of a particular web site leave behind valuable information on their preferences, requirements, and demands regarding the offered products and/or services. Understanding these requirements online, i.e., during a particular visit, is both a difficult technical challenge and a tremendous business opportunity. Web sites that can provide effective online navigation suggestions to their visitors can exploit the potential inherent in the data such visits generate every day. However, identifying, collecting, and maintaining the necessary knowledge that navigation suggestions are based on is far from trivial. We propose a methodology for acquiring and maintaining this knowledge efficiently using data mart and web mining technology. Its effectiveness has been shown in an application for a bank's web site.

7 citations
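
The abstract leaves the mining method open; one simple form such navigation knowledge could take is a table of page-to-page transition counts mined from session logs, as in this illustrative sketch:

```python
from collections import Counter, defaultdict

def mine_transitions(sessions):
    """Count page-to-page transitions across all visitor sessions."""
    trans = defaultdict(Counter)
    for pages in sessions:
        for a, b in zip(pages, pages[1:]):
            trans[a][b] += 1
    return trans

def suggest(trans, current_page, k=2):
    """Suggest the k most common next pages after the current one."""
    return [p for p, _ in trans[current_page].most_common(k)]

if __name__ == "__main__":
    sessions = [["home", "loans", "rates"],
                ["home", "loans", "contact"],
                ["home", "accounts"]]
    t = mine_transitions(sessions)
    print(suggest(t, "loans"))  # e.g. ['rates', 'contact']
```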


Proceedings Article
01 Jan 2005
TL;DR: This paper proposes an approach for the automatic generation of a data warehouse schema from data mart schemes, based on a two-phase integration method defined in terms of a set of rules.
Abstract: This paper proposes an approach for the automatic generation of a data warehouse schema from data mart schemes. Our approach integrates the multidimensional schemes of data marts (star/constellation) to generate a data warehouse schema. It is based on a two-phase integration method defined in terms of a set of rules. The first phase transforms each multidimensional model into a UML class diagram. The second phase builds the data warehouse schema by integrating the UML class diagrams. The UML class diagram is appropriate for representing the different concepts of the two types of DM/DW models.

6 citations
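
A rough sketch of the two-phase method as described: phase one turns each star schema into a small class model, and phase two integrates the models, here by unifying same-named dimensions (a simplifying assumption; the paper's actual rules are richer):

```python
from dataclasses import dataclass, field

@dataclass
class ClassDiagram:
    facts: dict = field(default_factory=dict)       # fact -> measures
    dimensions: dict = field(default_factory=dict)  # dim -> attributes

def to_diagram(star):
    """Phase 1: a star schema becomes a small class diagram."""
    return ClassDiagram({star["fact"]: star["measures"]},
                        dict(star["dimensions"]))

def integrate(diagrams):
    """Phase 2: union facts; merge same-named dimensions by attribute union."""
    dw = ClassDiagram()
    for d in diagrams:
        dw.facts.update(d.facts)
        for name, attrs in d.dimensions.items():
            merged = dw.dimensions.setdefault(name, [])
            merged.extend(a for a in attrs if a not in merged)
    return dw

if __name__ == "__main__":
    sales = {"fact": "Sales", "measures": ["amount"],
             "dimensions": {"Time": ["day", "year"], "Store": ["city"]}}
    stock = {"fact": "Stock", "measures": ["level"],
             "dimensions": {"Time": ["day", "month"]}}
    print(integrate([to_diagram(sales), to_diagram(stock)]))
```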


Proceedings ArticleDOI
19 Sep 2005
TL;DR: This work shows an approach to the analysis of complex log data based on a parallel stream processing architecture and the use of specialized languages, namely a grammatical parser and a logic programming module, which together offer an efficient, flexible, and powerful solution.
Abstract: Navigation and interaction patterns of Web users can be relatively complex, especially for sites with interactive applications that support user sessions and profiles. We describe such a case for an interactive virtual garment dressing room. The application is distributed over many web sites; it supports personalization, user profiles, and the notion of a multi-site user session. It has its own data logging system that generates approximately 5 GB of complex data per month. The analysis of those logs requires more sophisticated processing than is typically done using a relational language. Even the use of procedural languages and a DBMS can prove tedious and inefficient. We show an approach to the analysis of complex log data based on a parallel stream processing architecture and the use of specialized languages, namely a grammatical parser and a logic programming module, that offers an efficient, flexible, and powerful solution.

4 citations
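
A toy version of the pipeline's front end, under stated assumptions: a regular expression stands in for the grammatical parser, and a sessionizer groups the event stream per user. The log format, parallelism, and logic-programming stage of the real system are not reproduced:

```python
import re

# Assumed line format; the real application's log grammar is richer.
LOG_LINE = re.compile(
    r"(?P<ts>\d+) site=(?P<site>\S+) user=(?P<user>\S+) ev=(?P<ev>\S+)")

def parse_stream(lines):
    """Yield parsed events, skipping lines the grammar rejects."""
    for line in lines:
        m = LOG_LINE.match(line)
        if m:
            yield m.groupdict()

def sessionize(events, gap=1800):
    """Group events per user into sessions, splitting whenever more
    than `gap` seconds pass between consecutive events."""
    sessions, last = {}, {}
    for e in events:
        u, t = e["user"], int(e["ts"])
        if u not in sessions or t - last[u] > gap:
            sessions.setdefault(u, []).append([])
        sessions[u][-1].append(e["ev"])
        last[u] = t
    return sessions

if __name__ == "__main__":
    log = ["100 site=a user=u1 ev=try_garment",
           "200 site=b user=u1 ev=save_profile",
           "9000 site=a user=u1 ev=try_garment"]
    print(sessionize(parse_stream(log)))
```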


Journal ArticleDOI
TL;DR: In this paper, the authors present a case in which the College of Business at a state university in the southern United States has implemented an information system (IS) with a data mart.
Abstract: This paper presents a case in which the College of Business (COB) at a state university in the southern United States has implemented an information system (IS) with a data mart. The system has been...

3 citations


Book ChapterDOI
01 Jan 2005
TL;DR: This article presents a process for executing data-mining projects; the emphasis is not so much on identifying ways to store data or on consolidating and aggregating data to provide a single, unified perspective, but on sifting through large volumes of historical data for new and valuable information that will lead to competitive advantage.
Abstract: In contrast to the Industrial Revolution, the Digital Revolution is happening much more quickly. For example, in 1946, the world’s first programmable computer, the Electronic Numerical Integrator and Computer (ENIAC), stood 10 feet tall, stretched 150 feet wide, cost millions of dollars, and could execute up to 5,000 operations per second. Twenty-five years later, Intel packed 12 times ENIAC’s processing power into a 12-square-millimeter chip. Today’s personal computers with Pentium processors perform in excess of 400 million instructions per second. Database systems, a subfield of computer science, have also seen notably accelerated advances. A major strength of database systems is their ability to store volumes of complex, hierarchical, heterogeneous, and time-variant data and to provide rapid access to information while correctly capturing and reflecting database updates. Together with the advances in database systems, our relationship with data has evolved from the prerelational and relational period to the data-warehouse period. Today, we are in the knowledge-discovery and data-mining (KDDM) period, where the emphasis is not so much on identifying ways to store data or on consolidating and aggregating data to provide a single, unified perspective. Rather, the emphasis of KDDM is on sifting through large volumes of historical data for new and valuable information that will lead to competitive advantage. The evolution to KDDM is natural since our capabilities to produce, collect, and store information have grown exponentially. Debit cards, electronic banking, e-commerce transactions, the widespread introduction of bar codes for commercial products, and advances in both mobile technology and remote-sensing data-capture devices have all contributed to the mountains of data stored in business, government, and academic databases. Traditional analytical techniques, especially standard query and reporting and online analytical processing, are ineffective in situations involving large amounts of data and where the exact nature of the information one wishes to extract is uncertain. Data mining has thus emerged as a class of analytical techniques that go beyond statistics and aim at examining large quantities of data; data mining is clearly relevant for the current KDDM period. According to Hirji (2001), data mining is the analysis and nontrivial extraction of data from databases for the purpose of discovering new and valuable information, in the form of patterns and rules, from relationships between data elements. Data mining is receiving widespread attention in the academic and public press literature (Berry & Linoff, 2000; Fayyad, Piatetsky-Shapiro, & Smyth, 1996; Kohavi, Rothleder, & Simoudis, 2002; Newton, Kendziorski, Richmond, & Blattner, 2001; Venter, Adams, & Myers, 2001; Zhang, Wang, Ravindranathan, & Miles, 2002), and case studies and anecdotal evidence to date suggest that organizations are increasingly investigating the potential of data-mining technology to deliver competitive advantage. As a multidisciplinary field, data mining draws from many diverse areas such as artificial intelligence, database theory, data visualization, marketing, mathematics, operations research, pattern recognition, and statistics.
Research into data mining has thus far focused on developing new algorithms and tools (Dehaspe & Toivonen, 1999; Deutsch, 2003; Jiang, Pei, & Zhang, 2003; Lee, Stolfo, & Mok, 2000; Washio & Motoda, 2003) and on identifying future application areas (Alizadeh et al., 2000; Li, Li, Zhu, & Ogihara, 2002; Page & Craven, 2003; Spangler, May, & Vargas, 1999). As a relatively new field of study, it is not surprising that data-mining research is not equally well developed in all areas. To date, no theory-based process model of data mining has emerged. The lack of a formal process model to guide the data-mining effort, as well as of an identification of the relevant factors that contribute to effectiveness, is becoming more critical as data-mining interest and deployment intensify. The emphasis of this article is to present a process for executing data-mining projects.

Proceedings Article
01 Jan 2005
TL;DR: This paper defines multidimensional models and a set of operators for the validation and refinement of patterns, covering the addition and deletion of hierarchies, dimensions, dimension attributes, non-dimension attributes, measures, and facts.
Abstract: Designing a decisional system requires a methodology different from those commonly adopted for operational information systems. In our methodology, data marts are constructed on the basis of user requirements specified using OLAP design patterns. Since these patterns are independent of any data source, the data mart design process should resolve the differences between the user OLAP requirements on the one hand and the data sources on the other. This paper first defines multidimensional models. Secondly, it defines a set of operators for the validation and refinement of patterns, covering the addition and deletion of hierarchies, dimensions, dimension attributes, non-dimension attributes, measures, and facts.
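
To illustrate what such refinement operators might look like, here is a toy sketch over a dictionary-based star-schema pattern; the names and structure are assumptions, not the paper's definitions:

```python
def add_dimension(pattern, name, attributes):
    """Refinement operator: add a dimension to a pattern (non-destructive)."""
    p = dict(pattern)
    p["dimensions"] = dict(p["dimensions"])
    p["dimensions"][name] = list(attributes)
    return p

def delete_measure(pattern, measure):
    """Refinement operator: remove a measure from the fact."""
    p = dict(pattern)
    p["measures"] = [m for m in p["measures"] if m != measure]
    return p

if __name__ == "__main__":
    pattern = {"fact": "Sales", "measures": ["amount", "margin"],
               "dimensions": {"Time": ["day", "year"]}}
    pattern = add_dimension(pattern, "Customer", ["segment"])
    pattern = delete_measure(pattern, "margin")
    print(pattern)
```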


Proceedings Article
14 Jul 2005
TL;DR: The analysis, design, and implementation of a data warehouse system for decisional processes based on Italian train booking data are presented; in order to satisfy all the customer's requests, the entire warehouse's data marts concerning the project will be completely re-engineered.
Abstract: The analysis, design, and implementation of the data warehouse system for the decisional process based on the Italian train booking data are presented. Trenitalia, the main Italian train service company, is the customer, and TSF (a railway telesystems company) is the IT solution provider. In particular, the feasibility requirements, functionality, technical architecture, and product technology are described. Moreover, guidelines on interfacing operational environments with the data warehouse for data acquisition and processing, and the related problems, are dealt with. With our contribution, and with the aim of software reuse, the provider has released the prototype system; in order to satisfy all the customer's requests, the entire warehouse's data marts concerning the project will be completely re-engineered.