Showing papers on "Online analytical processing" published in 2003

Patent•

System and method for online analytical processing

David Greenfield, Geof Fred Lyon, Ron Vogl, Scott A. Feinstein

17 Dec 2003

TL;DR: In this article, a system and method for analyzing data is described, in which an application programming interface (API) is provided to permit an online analytical processing (OLAP) application to manipulate data and queries in a model close to the business model the OLAP application was designed to support.

Abstract: A system and method for analyzing data is described, in which an application programming interface (API) is provided to permit an online analytical processing (OLAP) application to manipulate data and queries in a model close to the business model the OLAP application was designed to support. A data server is provided to translate between the object-oriented representation and the native database query format. In one embodiment, a multidimensional virtual cursor is implemented to further simplify the logic of the OLAP application.

211 citations

Patent•

System and method for management of an automatic OLAP report broadcast system

Michael J. Saylor¹, Kyle N. Yost, Peter G. Wilding, Robert G. Trenkamp•Institutions (1)

MicroStrategy¹

10 Jan 2003

TL;DR: In this paper, a system, method and processor medium that manages automatic generation of output from an on-line analytical processing system is described, which enables administrator control over processing by enabling administrators to view all services and all subscribers of the system, by maintaining an address book containing entries for subscribers of a service and enabling a system user to view the contents of the address book, and by scheduling processing of services.

Abstract: A system, method and processor medium that manages automatic generation of output from an on-line analytical processing system. Scheduled services are processed in an on-line analytical processing system and output from the OLAP system is then automatically forwarded to one or more subscriber output devices specified for that service. The system manages the operation of the service processing system to increase throughput, increase speed, and improve the administrator control over the processing. The system may maintain dynamic recipient lists that are resolved by the system. The system enables administrator control over processing by enabling administrators to view all services and all subscribers of the system, by maintaining an address book containing entries for subscribers of the service and enabling a system user to view the contents of the address book, and by scheduling processing of services. The system governs the volume of services being processed, the number of subscribers to a particular service, and the number of output devices to which a service may be broadcast.

183 citations

Patent•

OLAP query generation engine

Lee Eric Kilmer, Richard Scott Bendickson, Thomas Joseph Pavelka, Gary Lynn Jaquier

20 Aug 2003

TL;DR: In this paper, a system and method for generating an On-Line Analytical Processing (OLAP) query is presented, where a query object is provided that supports a plurality of OLAP servers which use different structured query formats.

Abstract: A system and method for generating an On-Line Analytical Processing (OLAP) query. A query object is provided that supports a plurality of OLAP servers which use different structured query formats. An OLAP server is determined from among the OLAP servers for which the query will be executed based on a property of the query object. The query object is processed to generate a query statement using the structured query format that corresponds to the determined OLAP server.
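
The dispatch described in this abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `QueryObject` class, its `server` property, the two server names, and the two generator functions. The MDX-style and SQL-style statements are deliberately simplified.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class QueryObject:
    """Hypothetical server-neutral description of an OLAP query."""
    server: str            # property used to pick the target server
    cube: str
    measure: str
    dimensions: List[str]


def to_mdx(q: QueryObject) -> str:
    # Simplified MDX-style statement: measure on columns, members on rows.
    rows = " * ".join(f"[{d}].Members" for d in q.dimensions)
    return (f"SELECT {{[Measures].[{q.measure}]}} ON COLUMNS, "
            f"{rows} ON ROWS FROM [{q.cube}]")


def to_sql(q: QueryObject) -> str:
    # Simplified ROLAP-style statement: group the fact table by the dimensions.
    dims = ", ".join(q.dimensions)
    return (f"SELECT {dims}, SUM({q.measure}) AS {q.measure} "
            f"FROM {q.cube} GROUP BY {dims}")


# Hypothetical registry: one statement generator per supported server type.
GENERATORS: Dict[str, Callable[[QueryObject], str]] = {
    "mdx_server": to_mdx,
    "sql_server": to_sql,
}


def generate(q: QueryObject) -> str:
    """Pick the generator from the query object's `server` property."""
    try:
        return GENERATORS[q.server](q)
    except KeyError:
        raise ValueError(f"unsupported OLAP server: {q.server}") from None
```

The same query object yields a different statement depending only on its `server` property, so supporting another server would mean registering one more generator.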

176 citations

Proceedings Article•DOI•

QC-trees: an efficient summary structure for semantic OLAP

Laks V. S. Lakshmanan¹, Jian Pei², Yan Zhao¹•Institutions (2)

University of British Columbia¹, University at Buffalo²

09 Jun 2003

TL;DR: This paper proposes an efficient data structure called QC-tree, which stores a quotient cube more compactly, together with an efficient algorithm for constructing it directly from a base table, and gives efficient algorithms for answering queries and for incremental maintenance against updates.

Abstract: Recently, a technique called quotient cube was proposed as a summary structure for a data cube that preserves its semantics, with applications for online exploration and visualization. The authors showed that a quotient cube can be constructed very efficiently and it leads to a significant reduction in the cube size. While it is an interesting proposal, that paper leaves many issues unaddressed. Firstly, a direct representation of a quotient cube is not as compact as possible and thus still wastes space. Secondly, while a quotient cube can in principle be used for answering queries, no specific algorithms were given in the paper. Thirdly, maintaining any summary structure incrementally against updates is an important task, a topic not addressed there. In this paper, we propose an efficient data structure called QC-tree and an efficient algorithm for directly constructing it from a base table, solving the first problem. We give efficient algorithms that address the remaining questions. We report results from an extensive performance study that illustrate the space and time savings achieved by our algorithms over previous ones (wherever they exist).

151 citations

Journal Article•DOI•

Multidimensional normal forms for data warehouse design

Jens Lechtenbörger¹, Gottfried Vossen¹•Institutions (1)

University of Münster¹

01 Jul 2003-Information Systems

TL;DR: A sequence of multidimensional normal forms is established that allow reasoning about the quality of conceptual data warehouse schemata in a rigorous manner and address traditional database design objectives such as faithfulness, completeness, and freedom of redundancies.

134 citations

Journal Article•DOI•

Materialized view selection as constrained evolutionary optimization

Jeffrey Xu Yu¹, Xin Yao, Chi-Hon Choi¹, Gang Gou¹•Institutions (1)

The Chinese University of Hong Kong¹

01 Nov 2003

TL;DR: This paper proposes a new constrained evolutionary algorithm that is able to find a near-optimal feasible solution and scales with the problem size well and shows that the constraint handling technique, i.e., stochastic ranking, can deal with constraints effectively.

Abstract: One of the important issues in data warehouse development is the selection of a set of views to materialize in order to accelerate a large number of on-line analytical processing (OLAP) queries. The maintenance-cost view-selection problem is to select a set of materialized views under certain resource constraints for the purpose of minimizing the total query processing cost. However, the search space for possible materialized views may be exponentially large. A heuristic algorithm often has to be used to find a near optimal solution. In this paper, for the maintenance-cost view-selection problem, we propose a new constrained evolutionary algorithm. Constraints are incorporated into the algorithm through a stochastic ranking procedure. No penalty functions are used. Our experimental results show that the constraint handling technique, i.e., stochastic ranking, can deal with constraints effectively. Our algorithm is able to find a near-optimal feasible solution and scales with the problem size well.
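
As a rough illustration of how a stochastic ranking procedure can replace penalty functions, the sketch below implements the commonly described bubble-sort form of stochastic ranking in Python. It is not the authors' code. The mapping onto view selection is an assumption: `objective` stands for the total query processing cost of a candidate view set, and `violation` for the amount by which its maintenance cost exceeds the budget (0 when feasible). The default `pf = 0.45` is an illustrative value, not one taken from the abstract.

```python
import random
from typing import List, Optional, Sequence


def stochastic_rank(objective: Sequence[float], violation: Sequence[float],
                    pf: float = 0.45,
                    rng: Optional[random.Random] = None) -> List[int]:
    """Return candidate indices ordered best-first by a stochastic bubble sort.

    objective[i] is the cost to minimize; violation[i] is 0 when candidate i
    satisfies the constraint and positive otherwise.
    """
    rng = rng or random.Random()
    order = list(range(len(objective)))
    for _ in range(len(order)):                 # at most N sweeps
        swapped = False
        for j in range(len(order) - 1):
            a, b = order[j], order[j + 1]
            both_feasible = violation[a] == 0 and violation[b] == 0
            if both_feasible or rng.random() < pf:
                # Compare by objective (always, when both are feasible).
                out_of_order = objective[a] > objective[b]
            else:
                # Otherwise compare by constraint violation.
                out_of_order = violation[a] > violation[b]
            if out_of_order:
                order[j], order[j + 1] = b, a
                swapped = True
        if not swapped:                         # stop once a sweep changes nothing
            break
    return order
```

An evolutionary algorithm would then keep the top-ranked candidates as parents for the next generation. With `pf` below 0.5, comparisons involving infeasible candidates are decided more often by violation than by cost, which is what pushes the population toward feasible solutions without an explicit penalty term.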

117 citations

Journal Article•DOI•

Application of data warehouse and Decision Support System in construction management

Kwok Wing Chau¹, Ying Cao², M. Anson¹, Jian-Ping Zhang²•Institutions (2)

Hong Kong Polytechnic University¹, Tsinghua University²

01 Mar 2003-Automation in Construction

TL;DR: In this article, a data warehouse is integrated with a Decision Support System (DSS) in order to provide information about and insight into the existing data, so that decisions can be made more efficiently without interrupting the daily work of an On-Line Transaction Processing (OLTP) system.

109 citations

Book•

Multidimensional Databases: Problems and Solutions

Maurizio Rafanelli

04 Mar 2003

TL;DR: Multidimensionality in Statistical, OLAP, and Scientific Databases; Conceptual Multidimensional Models; Hierarchies; Operators for Multidimensional Aggregate Data; Time in Multidimensional Databases; Dynamic Multidimensional Data Cubes; Materialized Views in Multidimensional Databases; Querying MD Data; Incomplete Information in Multidimensional Databases

Abstract: Multidimensionality in Statistical, OLAP, and Scientific Databases; Conceptual Multidimensional Models; Hierarchies; Operators for Multidimensional Aggregate Data; Time in Multidimensional Databases; Dynamic Multidimensional Data Cubes; Materialized Views in Multidimensional Databases; Querying MD Data; Incomplete Information in Multidimensional Databases; Privacy in Multidimensional Databases; Source Integration for Data Warehousing; Cooperation with Geographic Databases

99 citations

Proceedings Article•DOI•

Spatial hierarchy and OLAP-favored search in spatial data warehouse

Fangyan Rao¹, Long Zhang¹, Xiu Lan Yu¹, Ying Li¹, Ying Chen¹•Institutions (1)

IBM¹

07 Nov 2003

TL;DR: This paper extends the traditional set-grouping hierarchy into multi-dimensional data space, proposes using a spatial index tree as the hierarchy on the spatial dimension to improve the performance of spatial OLAP queries, and introduces a heuristic search method that can provide an approximate answer to a spatial OLAP query.

Abstract: Data warehouses and Online Analytical Processing (OLAP) play a key role in business intelligence systems. With the increasing amount of spatial data stored in business databases, how to use this spatial information to gain insight into business data from a geo-spatial point of view is becoming an important issue for data warehouses and OLAP. However, traditional data warehouse and OLAP tools cannot fully exploit spatial data in coordinates because multi-dimensional spatial data does not have an implicit or explicit concept hierarchy with which to compute pre-aggregation and materialization in a data warehouse. In this paper we extend the traditional set-grouping hierarchy into multi-dimensional data space and propose to use a spatial index tree as the hierarchy on the spatial dimension. With a spatial hierarchy, a spatial data warehouse can be built accordingly. Our approach preserves the star schema in the data warehouse while building the hierarchy on the spatial dimension, and can be easily integrated into existing data warehouse and OLAP systems. To process spatial OLAP queries in a spatial data warehouse, we propose an OLAP-favored search method which can use the pre-aggregation results in the spatial data warehouse to improve the performance of spatial OLAP queries. For generality, the algorithm is developed based on the Generalized Index Searching Tree (GiST). To improve the performance of OLAP-favored search, we further introduce a heuristic search method which can provide an approximate answer to a spatial OLAP query. Experimental results show the efficiency of our method.

90 citations

Journal Article•DOI•

DocCube: multi-dimensional visualisation and exploration of large document sets

Josiane Mothe¹, Claude Chrisment², Bernard Dousset², Joel Alaux²•Institutions (2)

Institut Universitaire de Formation des Maîtres¹, Paul Sabatier University²

01 May 2003-Journal of the Association for Information Science and Technology

TL;DR: A novel user interface is presented that provides global visualizations of large document sets in order to help users to formulate the query that corresponds to their information needs and to access the corresponding documents.

Abstract: This paper presents a novel user interface that provides global visualizations of large document sets in order to help users to formulate the query that corresponds to their information needs and to access the corresponding documents. An important element of the approach we introduce is the use of concept hierarchies (CHs) in order to structure the document collection. Each CH corresponds to a facet of the documents users can be interested in. Users browse these CHs in order to specify and refine their information needs. Additionally the interface is based on OLAP principles and multi-dimensional analysis operators are provided to users in order to allow them to explore a document collection.

87 citations

Proceedings Article•DOI•

Spreadsheets in RDBMS for OLAP

Andrew Witkowski, Srikanth Bellamkonda, Tolga Bozkaya, Gregory Dorman, Nathan Folkert, Abhinav Gupta, Lei Shen, Sankar Subramanian¹•Institutions (1)

Oracle Corporation¹

09 Jun 2003

TL;DR: This paper presents SQL extensions involving array based calculations for complex modeling, and presents optimizations, access structures and execution models for processing them efficiently.

Abstract: One of the critical deficiencies of SQL is lack of support for n-dimensional array-based computations which are frequent in OLAP environments. Relational OLAP (ROLAP) applications have to emulate them using joins, recently introduced SQL Window Functions [18] and complex and inefficient CASE expressions. The designated place in SQL for specifying calculations is the SELECT clause, which is extremely limiting and forces the user to generate queries using nested views, subqueries and complex joins. Furthermore, SQL-query optimizer is pre-occupied with determining efficient join orders and choosing optimal access methods and largely disregards optimization of complex numerical formulas. Execution methods concentrated on efficient computation of a cube [11], [16] rather than on random access structures for inter-row calculations. This has created a gap that has been filled by spreadsheets and specialized MOLAP engines, which are good at formulas for mathematical modeling but lack the formalism of the relational model, are difficult to manage, and exhibit scalability problems. This paper presents SQL extensions involving array based calculations for complex modeling. In addition, we present optimizations, access structures and execution models for processing them efficiently.

Journal Article•DOI•

Integrating GIS components with knowledge discovery technology for environmental health decision support.

Yvan Bédard¹, Pierre Gosselin, Sonia Rivest¹, Marie-Josée Proulx¹, Martin Nadeau¹, Germain Lebel, Marie-France Gagnon•Institutions (1)

Laval University¹

01 Apr 2003-International Journal of Medical Informatics

TL;DR: This paper presents an example of a SOLAP application in the field of environmental health: the ICEM-SE project and describes the design of this system and how it provides fast and easy access to the detailed and aggregated data that are needed for GKD and decision-making in public health.

Proceedings Article•DOI•

Multi-dimensional clustering: a new data layout scheme in DB2

Sriram Padmanabhan¹, Bishwaranjan Bhattacharjee¹, Tim Malkemus¹, Leslie A. Cranston¹, Matthew A. Huras¹•Institutions (1)

IBM¹

09 Jun 2003

TL;DR: The design and implementation of a new data layout scheme, called multi-dimensional clustering, in DB2 Universal Database Version 8.0 is described and novel techniques for maintaining this physical layout efficiently and methods of processing database operations that provide significant performance improvements are described.

Abstract: We describe the design and implementation of a new data layout scheme, called multi-dimensional clustering, in DB2 Universal Database Version 8. Many applications, e.g., OLAP and data warehousing, process a table or tables in a database using a multi-dimensional access paradigm. Currently, most database systems can only support organization of a table using a primary clustering index. Secondary indexes are created to access the tables when the primary key index is not applicable. Unfortunately, secondary indexes perform many random I/O accesses against the table for a simple operation such as a range query. Our work in multi-dimensional clustering addresses this important deficiency in database systems. Multi-Dimensional Clustering is based on the definition of one or more orthogonal clustering attributes (or expressions) of a table. The table is organized physically by associating records with similar values for the dimension attributes in a cluster. We describe novel techniques for maintaining this physical layout efficiently and methods of processing database operations that provide significant performance improvements. We show results from experiments using a star-schema database to validate our claims of performance with minimal overhead.
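
The core idea of grouping records by the values of their clustering attributes can be sketched in a few lines of Python. This is a toy, in-memory illustration under stated assumptions, not DB2's implementation: the `ClusteredTable` class and its methods are hypothetical, each "cell" is a plain list rather than a set of disk blocks, and no block indexes are modeled.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


class ClusteredTable:
    """Toy table that groups rows into cells keyed by their dimension values."""

    def __init__(self, dimensions: Tuple[str, ...]) -> None:
        self.dimensions = dimensions
        self.cells: Dict[Tuple[object, ...], List[Row]] = defaultdict(list)

    def insert(self, row: Row) -> None:
        # Rows with the same combination of dimension values land in one cell.
        key = tuple(row[d] for d in self.dimensions)
        self.cells[key].append(row)

    def select(self, **equals: object) -> Iterable[Row]:
        """Yield rows matching the given dimension values.

        Cells whose key does not match are skipped without touching their rows.
        """
        positions = {self.dimensions.index(d): v for d, v in equals.items()}
        for key, rows in self.cells.items():
            if all(key[i] == v for i, v in positions.items()):
                yield from rows
```

A query such as `table.select(year=2003)` inspects cell keys only and reads rows from matching cells, which is the access pattern that physical clustering on several dimensions is meant to make cheap.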

Proceedings Article•DOI•

Towards integrative enterprise knowledge portals

Torsten Priebe¹, Günther Pernul¹•Institutions (1)

University of Regensburg¹

03 Nov 2003

TL;DR: This paper discusses integration aspects within enterprise knowledge portals and presents an approach for communicating the user context (revealing the user's information need) among portlets, utilizing Semantic Web technologies.

Abstract: Knowledge portals make an important contribution to enabling enterprise knowledge management by providing users with a consolidated, personalized user interface that allows efficient access to various types of (structured and unstructured) information. Today's portal systems allow combining access modules to different information sources side by side on a single portal webpage. However, there is no interaction between those so-called portlets. When a user navigates within one portlet, the others remain unchanged, which means that each source has to be searched individually for relevant information. This paper discusses integration aspects within enterprise knowledge portals and presents an approach for communicating the user context (revealing the user's information need) among portlets, utilizing Semantic Web technologies. For example, the query context of an OLAP portlet, which provides access to structured data stored in a data warehouse, can be used by an information retrieval portlet in order to automatically provide the user with related documents found in the organization's document management system. The paper briefly presents a prototype that we are building to evaluate our approach, demonstrating such an OLAP and information retrieval integration.

Patent•

Operational reporting architecture

Gerd Danner, Andreas Wesselmann

01 Dec 2003

TL;DR: In this paper, an architecture and system for integrating online transactional processing (OLTP) systems with OLAP systems is presented, which includes a data access layer having one or more data access programs for accessing OLTP data, a service layer having a business intelligence (BI) platform for generating OLAP data, and a data abstraction layer providing a common meta-model for OLTP data integrated with OLAP data.

Abstract: An architecture and system for integrating online transactional processing (OLTP) systems with online analytical processing (OLAP) systems. The architecture includes a data access layer having one or more data access programs for accessing OLTP data, a service layer having a business intelligence (BI) platform for generating OLAP data, and a data abstraction layer providing a common meta-model for OLTP data integrated with OLAP data. The architecture further includes a user interface presentation layer configured to provide a user interface for displaying a report run on the integrated OLTP and OLAP data.

Proceedings Article•DOI•

Handling evolutions in multidimensional structures

M. Body, Maryvonne Miquel, Yvan Bédard, Anne Tchounikine

05 Mar 2003

TL;DR: This work proposes a novel approach that makes it possible to track history and also to compare data mapped into static structures; it defines a conceptual model and provides possible logical adaptations to implement it on current commercial OLAP systems.

Abstract: Building multidimensional systems requires gathering data from heterogeneous sources over time. The data is then integrated into multidimensional structures organized around several axes of analysis, or dimensions. But these analysis structures are likely to vary over time, and existing multidimensional models do not (or only partially) take these evolutions into account. Hence, the designer of a data warehouse faces a dilemma: either keep track of evolutions, which limits analysts' ability to compare data, or map all data into a given version of the structure, which entails alteration (or even loss) of data. We propose a novel approach that offers another alternative, making it possible to track history and also to compare data mapped into static structures. We define a conceptual model and provide possible logical adaptations to implement it on current commercial OLAP systems. Finally, we present the global architecture that we used for our prototype.

Proceedings Article•DOI•

Advanced visualization for OLAP

Andreas S. Maniatis¹, Panos Vassiliadis², Spiros Skiadopoulos¹, Yannis Vassiliou²•Institutions (2)

National Technical University of Athens¹, University of Ioannina²

07 Nov 2003

TL;DR: This paper demonstrates how the Cube Presentation Model (CPM), a novel presentational model for OLAP screens, can be naturally mapped on the Table Lens, which is an advanced visualization technique from the Human-Computer Interaction area, particularly tailored for cross-tab reports.

Abstract: Data visualization is one of the big issues of database research. OLAP, as a decision support technology, is closely related to developments in the data visualization area. In this paper we demonstrate how the Cube Presentation Model (CPM), a novel presentational model for OLAP screens, can be naturally mapped onto the Table Lens, which is an advanced visualization technique from the Human-Computer Interaction area, particularly tailored for cross-tab reports. We consider how the user interacts with an OLAP screen and, based on the particularities of the Table Lens, we propose automated proactive user support. Finally, we discuss the necessity and the applicability of advanced visualization techniques in the presence of recent technological developments.

Journal Article•DOI•

Parallel ROLAP data cube construction on shared-nothing multiprocessors

Ying Chen¹, Frank Dehne², Todd Eavis¹, Andrew Rau-Chaplin¹•Institutions (2)

Dalhousie University¹, Carleton University²

22 Apr 2003

TL;DR: This paper presents a parallel method for generating data cubes on a shared-nothing multiprocessor that uses a ROLAP representation of the data cube where views are stored as relational tables and allows for tight integration with current relational database technology.

Abstract: The pre-computation of data cubes is critical to improving the response time of on-line analytical processing (OLAP) systems and can be instrumental in accelerating data mining tasks in large data warehouses. In order to meet the need for improved performance created by growing data sizes, parallel solutions for generating the data cube are becoming increasingly important. The paper presents a parallel method for generating data cubes on a shared-nothing multiprocessor. Since no (expensive) shared disk is required, our method can be used on low-cost Beowulf-style clusters consisting of standard PCs with local disks connected via a data switch. Our approach uses a ROLAP representation of the data cube where views are stored as relational tables. This allows for tight integration with current relational database technology. We have implemented our parallel shared-nothing data cube generation method and evaluated it on a PC cluster, exploring relative speedup, local vs. global schedule trees, data skew, cardinality of dimensions, data dimensionality, and balance tradeoffs. For an input data set of 2,000,000 rows (72 Megabytes), our parallel data cube generation method achieves close to optimal speedup, generating a full data cube of ≈227 million rows (5.6 Gigabytes) on a 16-processor cluster in under 6 minutes. For an input data set of 10,000,000 rows (360 Megabytes), our parallel method, running on a 16-processor PC cluster, created a data cube consisting of ≈846 million rows (21.7 Gigabytes) in under 47 minutes.
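
To make concrete what "generating a full data cube" produces, here is a small sequential Python sketch that computes every group-by view of a fact table, one view per subset of the dimensions (2^d views for d dimensions). It is not the paper's parallel method: the function name `build_cube` is hypothetical, SUM is assumed as the aggregate, each view is a dictionary standing in for a relational table, and no work is partitioned across processors.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]
View = Dict[Tuple[object, ...], float]


def build_cube(rows: List[Row], dims: Sequence[str],
               measure: str) -> Dict[Tuple[str, ...], View]:
    """Compute SUM(measure) for every subset of `dims` (2**len(dims) views)."""
    cube: Dict[Tuple[str, ...], View] = {}
    for k in range(len(dims) + 1):
        for group_by in combinations(dims, k):
            view: View = defaultdict(float)
            for row in rows:
                # Project the row onto the group-by attributes and accumulate.
                view[tuple(row[d] for d in group_by)] += row[measure]
            cube[group_by] = dict(view)
    return cube
```

Because every view rescans the input, the cost grows with 2^d times the table size, which gives a sense of why pre-computation at the scale reported in the abstract calls for shared sorting work and parallel hardware.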

Proceedings Article•DOI•

Ontology-based integration of OLAP and information retrieval

Torsten Priebe¹, Günther Pernul¹•Institutions (1)

University of Regensburg¹

01 Sep 2003

TL;DR: An ontology-based approach for building an enterprise knowledge portal that integrates OLAP and information retrieval functionality to access both structured data stored in a data warehouse and unstructured data in the form of documents is described.

Abstract: This paper describes an ontology-based approach for building an enterprise knowledge portal that integrates OLAP and information retrieval functionality to access both structured data stored in a data warehouse and unstructured data in the form of documents. We discuss how to perform global searches over these information sources. In addition, our approach provides adaptive searching by tracking the user context. When a user performs ad-hoc navigation in an OLAP report, the system will be able to use the query context information to also search for relevant documents.

Journal Article•DOI•

Efficient OLAP query processing in distributed data warehouses

Michael O. Akinde¹, Michael H. Böhlen¹, Theodore Johnson², Laks V. S. Lakshmanan³, Divesh Srivastava²•Institutions (3)

Aalborg University¹, AT&T Labs², University of British Columbia³

01 Mar 2003-Information Systems

TL;DR: The Skalla system translates OLAP queries, specified as certain algebraic expressions, into distributed evaluation plans which are shipped to individual sites, and proposes a variety of optimizations to minimize both the synchronization traffic and the local processing done at each site.
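
One generic way to keep traffic between sites low in a distributed warehouse is to ship partial aggregates instead of raw rows. The Python sketch below illustrates that general technique for an average; it is not the Skalla algorithm, and the function names `local_aggregate` and `merge` are hypothetical. Each site reduces its own rows to a (sum, count) pair per group, and a coordinator combines the pairs.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]
Partial = Dict[object, Tuple[float, int]]


def local_aggregate(rows: Iterable[Row], key: str, measure: str) -> Partial:
    """Run at each site: reduce local rows to one (sum, count) pair per group."""
    acc: Dict[object, List[float]] = defaultdict(lambda: [0.0, 0])
    for row in rows:
        acc[row[key]][0] += row[measure]
        acc[row[key]][1] += 1
    return {k: (s, int(n)) for k, (s, n) in acc.items()}


def merge(partials: Iterable[Partial]) -> Dict[object, float]:
    """Run at the coordinator: combine per-site pairs into a global average."""
    total: Dict[object, List[float]] = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for k, (s, n) in partial.items():
            total[k][0] += s
            total[k][1] += n
    return {k: s / n for k, (s, n) in total.items()}
```

Only one small pair per group crosses the network from each site, regardless of how many rows the site holds; this works because sum and count can be combined across sites, whereas averaging the per-site averages directly would give wrong results when sites hold different numbers of rows.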

Journal Article•DOI•

Equilibrium relations and bipolar cognitive mapping for online analytical processing with applications in international relations and strategic decision support

Wen-Ran Zhang¹•Institutions (1)

Georgia Southern University¹

01 Apr 2003

TL;DR: This work bridges a gap for CM-based clustering and visualization in OLAP and OLAM by proposing bipolar logic, bipolar sets, and equilibrium relations as formal logical models as well as cognitive models for bipolar cognitive mapping and visualization.

Abstract: Bipolar logic, bipolar sets, and equilibrium relations are proposed for bipolar cognitive mapping and visualization in online analytical processing (OLAP) and online analytical mining (OLAM). As cognitive models, cognitive maps (CMs) hold great potential for clustering and visualization. Due to the lack of a formal mathematical basis, however, CM-based OLAP and OLAM have not gained popularity. Compared with existing approaches, bipolar cognitive mapping has a number of advantages. First, bipolar CMs are formal logical models as well as cognitive models. Second, equilibrium relations (with polarized reflexivity, symmetry, and transitivity), as bipolar generalizations and fusions of equivalence relations, provide a theoretical basis for bipolar visualization and coordination. Third, an equilibrium relation or CM induces bipolar partitions that distinguish disjoint coalition subsets not involved in any conflict, disjoint coalition subsets involved in a conflict, disjoint conflict subsets, and disjoint harmony subsets. Finally, equilibrium energy analysis leads to harmony and stability measures for strategic decision and multiagent coordination. Thus, this work bridges a gap for CM-based clustering and visualization in OLAP and OLAM. Basic ideas are illustrated with example CMs in international relations.

Proceedings Article•DOI•

Hand-OLAP: a system for delivering OLAP services on handheld devices

Alfredo Cuzzocrea, Filippo Furfaro, Domenico Saccà

09 Apr 2003

TL;DR: This paper describes a very effective compression technique for datacubes and the architecture of a system (based on this compression technique), called Hand-OLAP, which allows a handheld device to extract and browse compressed information coming from an OLAP server distributed on a wired network.

Abstract: The main drawbacks of handheld devices (small storage space, small size of the display screen, discontinuance of the connection to the WLAN, etc.) are often incompatible with the need of querying and browsing information extracted from the enormous amount of data which are accessible through the network. In this application scenario, the issues of compression and summarization of data have a leading role: data in a lossy compressed format can be transmitted more efficiently than original ones, and can be effectively stored in the handheld devices (setting the compression ratio accordingly). In this paper we describe a very effective compression technique for datacubes and the architecture of a system (based on this compression technique), called Hand-OLAP, which allows a handheld device to extract and browse compressed information coming from an OLAP server distributed on a wired network.

Proceedings Article•DOI•

A quad-tree based multiresolution approach for two-dimensional summary data

Francesco Buccafurri, Filippo Furfaro, Domenico Saccà, Cristina Sirangelo

09 Jul 2003

TL;DR: This paper restricts its attention to two-dimensional data, which are relevant for a number of applications, and proposes a hierarchical summarization technique, which is combined with the use of indices, i.e. compact structures providing an approximate description of portions of the original data.

Abstract: In many application contexts, like statistical databases, scientific databases, query optimizers, OLAP, and so on, data are often summarized into synopses of aggregate values. Summarization has the great advantage of saving space, but querying aggregate data rather than the original ones introduces estimation errors which cannot be in general avoided, as summarization is a lossy compression. A central problem in designing summarization techniques is to retain a certain degree of accuracy in reconstructing query answers. In this paper we restrict our attention to two-dimensional data, which are relevant for a number of applications, and propose a hierarchical summarization technique, which is combined with the use of indices, i.e. compact structures providing an approximate description of portions of the original data. Experimental results show that the technique gives approximation errors much smaller than other "general purpose" techniques, such as wavelets and various types of multi-dimensional histogram.

Journal Article•DOI•

Applying evolutionary algorithms to materialized view selection in a data warehouse

Jorng-Tzong Horng¹, Y. J. Chang¹, B. J. Liu¹•Institutions (1)

National Central University¹

01 Aug 2003

TL;DR: A genetic algorithm to choose materialized views is presented and experiments are used to demonstrate the power of this approach.

Abstract: Effective analysis of genome sequences and associated functional data requires access to many different kinds of biological information. A data warehouse [14,16] plays an important role in the storage and analysis of genome sequence and functional data. A data warehouse stores many materialized views to support efficient decision-support or OLAP queries. The view-selection problem, that is, selecting the fittest set of materialized views from a variety of MVPPs, forms a challenge in data warehouse research. In this paper, we present a genetic algorithm to choose materialized views. We also use experiments to demonstrate the power of our approach.
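
As a rough illustration of how a genetic algorithm can be applied to view selection, the Python sketch below encodes each candidate set of views as a bit string and evolves it under a storage budget. Everything here is an assumption made for illustration: the per-view `sizes` and `benefits`, the additive fitness function, the budget constraint and the parameter defaults are hypothetical, and the paper's MVPP-based cost model is not reproduced.

```python
import random


def fitness(chrom, sizes, benefits, budget):
    """Total benefit of the selected views, or -1 if they exceed the budget."""
    size = sum(s for s, bit in zip(sizes, chrom) if bit)
    if size > budget:
        return -1  # infeasible selections always lose to feasible ones
    return sum(b for b, bit in zip(benefits, chrom) if bit)


def select_views(sizes, benefits, budget, pop_size=30, generations=60,
                 p_mut=0.05, seed=0):
    """Evolve a bit string; bit i is True when view i is materialized."""
    rng = random.Random(seed)
    n = len(sizes)  # assumes at least two candidate views

    def score(chrom):
        return fitness(chrom, sizes, benefits, budget)

    # Start from the empty selection (always feasible) plus random selections.
    pop = [[False] * n]
    pop += [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size - 1)]
    best = max(pop, key=score)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: the best selection so far always survives
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 3), key=score)  # tournament selection
            p2 = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n)                # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=score)
    return best


sizes = [4, 3, 2, 5, 1]        # hypothetical storage cost of each view
benefits = [10, 7, 4, 12, 2]   # hypothetical query-time saving of each view
chosen = select_views(sizes, benefits, budget=8)
```

Because the empty selection is part of the initial population and the best chromosome is carried over each generation, the returned selection always fits the budget; how close it gets to the optimum depends on the parameters.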

Proceedings Article•DOI•

Hierarchical dwarfs for the rollup cube

Yannis Sismanis¹, Antonios Deligiannakis¹, Yannis Kotidis², Nick Roussopoulos¹•Institutions (2)

University of Maryland, College Park¹, AT&T Labs²

07 Nov 2003

TL;DR: Extensions to the Dwarf architecture for incorporating rollup data cubes with hierarchical dimensions are presented, showing that the extended Hierarchical Dwarf retains all its advantages both in terms of creation time and space while being able to directly and efficiently support aggregate queries on every level of a dimension's hierarchy.

Abstract: The data cube operator exemplifies two of the most important aspects of OLAP queries: aggregation and dimension hierarchies. In earlier work we presented Dwarf, a highly compressed and clustered structure for creating, storing and indexing data cubes. Dwarf is a complete architecture that supports queries and updates, while also including a tunable granularity parameter that controls the amount of materialization performed. However, it does not directly support dimension hierarchies. Rollup and drilldown queries on dimension hierarchies that naturally arise in OLAP need to be handled externally and are, thus, very costly. In this paper we present extensions to the Dwarf architecture for incorporating rollup data cubes, i.e. cubes with hierarchical dimensions. We show that the extended Hierarchical Dwarf retains all its advantages both in terms of creation time and space while being able to directly and efficiently support aggregate queries on every level of a dimension's hierarchy.
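
To make the notion of a rollup query over a dimension hierarchy concrete, here is a small Python sketch. It is not the Dwarf structure and says nothing about how Hierarchical Dwarf stores or compresses aggregates; it only shows what "aggregate at a given level of the hierarchy" means. The `rollup` function, the parent map and the sample data are hypothetical.

```python
from collections import defaultdict


def rollup(facts, parent, levels_up):
    """Aggregate leaf-level measures `levels_up` steps higher in the hierarchy.

    facts:  {leaf_member: measure}
    parent: {member: parent_member}, e.g. day -> month -> year
    """
    out = defaultdict(float)
    for member, value in facts.items():
        for _ in range(levels_up):
            member = parent[member]  # climb one level of the hierarchy
        out[member] += value
    return dict(out)


sales_by_day = {"2003-01-10": 5, "2003-01-20": 7, "2003-02-01": 3}
time_parent = {
    "2003-01-10": "2003-01", "2003-01-20": "2003-01", "2003-02-01": "2003-02",
    "2003-01": "2003", "2003-02": "2003",
}
by_month = rollup(sales_by_day, time_parent, 1)
by_year = rollup(sales_by_day, time_parent, 2)
```

Computed this way, each rollup rescans the leaf data, which is the "handled externally" cost the abstract describes; the paper's contribution is to answer such queries directly from the stored structure.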

Book Chapter•DOI•

Multidimensional Schemas Quality: Assessing and Balancing Analyzability and Simplicity

Samira Si-Said Cherfi¹, Nicolas Prat²•Institutions (2)

Conservatoire national des arts et métiers¹, ESSEC Business School²

13 Oct 2003

TL;DR: The main objective of the approach is to provide the data warehouse designer with precise measures to support him in the choice among several alternative multidimensional schemas, more specifically on the analyzability and simplicity criteria.

Abstract: A data warehouse is a database focused on decision making. Decision makers typically access data warehouses through OLAP tools, based on a multidimensional representation of data. In the past, the key issue of data warehouse quality has often been centered on data quality. However, since OLAP tool users directly access multidimensional schemas, multidimensional schema quality evaluation is also crucial. This paper focuses on the quality of multidimensional schemas, more specifically on the analyzability and simplicity criteria. We present the underlying multidimensional model and address the problem of measuring and finding the right balance between analyzability and simplicity of multidimensional schemas. Analyzability and simplicity are assessed using quality metrics which are described and illustrated based on a case study. The main objective of our approach is to provide the data warehouse designer with precise measures to support him in the choice among several alternative multidimensional schemas.

Solap: a new type of user interface to support spatio-temporal multidimensional data exploration and analysis

Sonia Rivest, Yvan Bédard¹, Martin Nadeau•Institutions (1)

Laval University¹

01 Jan 2003

TL;DR: This document presents the concepts of SOLAP, the characteristics of this new type of user interface, and examples related to a few of the many possible application domains.

Abstract: It is well known that transactional and analysis systems each require a different database structure. In general, the database structure of transactional systems is optimized for consistency and efficient updates while the database structure of analysis systems is optimized for complex query performance. Non-spatial data are reorganized in data warehouses in order to support analysis and decision-making. In the same way, spatial data need to be stored in spatial data warehouses to support spatio-temporal decision-making. However, the current client tools used to exploit the data warehouse are not well adapted to fully exploit the spatial data warehouse. New client tools are therefore required to take full advantage of the geometric component of the spatial data. GIS are potential candidates but, despite interesting spatio-temporal analysis capabilities, it is recognized that current GIS systems per se are not optimally designed to support decision applications and that alternative solutions should be used (Bedard et al., 2001). Among them, the Spatial OLAP (SOLAP) tools offer promising possibilities. A SOLAP tool can be defined as “a visual platform built especially to support rapid and easy spatio-temporal analysis and exploration of data following a multidimensional approach comprised of aggregation levels available in cartographic displays as well as in tabular and diagram displays” (Bedard, 1997). SOLAP tools form a new family of user interfaces and are meant to be client applications sitting on top of a multi-scale spatial data warehouse. They are based on the multidimensional paradigm. This document presents the concepts of SOLAP, the characteristics of this new type of user interface, and examples related to a few of the many possible application domains. A live demonstration of a SOLAP tool will complete this document.

Proceedings Article•DOI•

Implementing operations to navigate semantic star schemas

Alberto Abelló, José Samos, Fèlix Saltor

07 Nov 2003

TL;DR: This paper shows when and how one can change the subject of analysis in the presence of semantic relationships, even if the analysis dimensions do not exactly coincide.

Abstract: In recent years, much work has been devoted to multidimensional modeling, star-shaped schemas and OLAP operations. However, "drill-across" has not received as much attention as other operations. This operation changes the subject of analysis while keeping the same analysis space that was used to analyze another subject. It is assumed that this can be done if both subjects share exactly the same analysis dimensions. In this paper, besides the implementation of an algebraic set of operations on an RDBMS, we show when and how we can change the subject of analysis in the presence of semantic relationships, even if the analysis dimensions do not exactly coincide.
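
The classical case, in which both subjects share exactly the same analysis dimensions, can be sketched in a few lines of Python. This covers only that baseline and not the paper's extension to semantically related dimensions; the function names, the row layout and the sample data are hypothetical.

```python
def aggregate(rows, dims, measure):
    """Sum `measure` for each combination of values of the given dimensions."""
    out = {}
    for row in rows:
        key = tuple(row[d] for d in dims)
        out[key] = out.get(key, 0) + row[measure]
    return out


def drill_across(rows_a, measure_a, rows_b, measure_b, shared_dims):
    """Pair the measures of two subjects over the same analysis space."""
    a = aggregate(rows_a, shared_dims, measure_a)
    b = aggregate(rows_b, shared_dims, measure_b)
    # Keep only the cells of the analysis space present in both subjects.
    return {key: (a[key], b[key]) for key in a.keys() & b.keys()}


sales = [
    {"region": "EU", "year": 2003, "amount": 10},
    {"region": "EU", "year": 2003, "amount": 5},
    {"region": "US", "year": 2003, "amount": 7},
]
shipments = [
    {"region": "EU", "year": 2003, "units": 3},
    {"region": "ASIA", "year": 2003, "units": 9},
]
combined = drill_across(sales, "amount", shipments, "units", ["region", "year"])
```

Each subject is first aggregated to the shared dimensions and the two results are then matched on the dimension values, which on an RDBMS would correspond to joining two grouped queries.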

Proceedings Article•DOI•

The OLAP and data warehousing approaches for analysis and sharing of results from dependability evaluation experiments

Henrique Madeira¹, João Costa², Marco Vieira²•Institutions (2)

University of Coimbra¹, Polytechnic Institute of Coimbra²

22 Jun 2003

TL;DR: This paper proposes the use of data warehousing technologies to store raw results from different experiments/setups in a common multidimensional structure where raw data can be analyzed and shared world wide by means of web-enabled OLAP (On-Line Analytical Processing) tools.

Abstract: Two important questions on experimental dependability evaluation remain largely unanswered: 1) how to analyze the usually large amount of raw data produced in dependability evaluation experiments and 2) how to compare results from different experiments or results from similar experiments across different systems. These problems are also common to other dependability evaluation techniques such as the ones based on simulation, or even to the analysis of field data on computer faults. We propose the use of data warehousing technologies to store raw results from different experiments/setups in a common multidimensional structure where raw data can be analyzed and shared world wide by means of web-enabled OLAP (On-Line Analytical Processing) tools. This paper describes how to use the proposed approach in a concrete example of dependability evaluation experiment.

Journal Article•DOI•

Multi-query optimization for on-line analytical processing

Panos Kalnis¹, Dimitris Papadias¹•Institutions (1)

Hong Kong University of Science and Technology¹

01 Jul 2003-Information Systems

TL;DR: This paper develops two novel greedy algorithms that construct the execution plan in a top-down manner by identifying in each step the most beneficial view, instead of finding the most promising query.
