Topic

Online analytical processing

About: Online analytical processing is a research topic. Over its lifetime, 5042 publications have been published within this topic, receiving 92175 citations. The topic is also known as: OLAP.


Papers
Proceedings Article
25 Aug 1997
TL;DR: A multi-dimensional database model is presented, which is believed to serve as a conceptual model for On-Line Analytical Processing (OLAP)-based applications and it is shown that the data cube operator can be expressed easily.
Abstract: We present a multi-dimensional database model, which we believe can serve as a conceptual model for On-Line Analytical Processing (OLAP)-based applications. Apart from providing the functionalities necessary for OLAP-based applications, the main feature of the model we propose is a clear separation between structural aspects and the contents. This separation of concerns allows us to define data manipulation languages in a reasonably simple, transparent way. In particular, we show that the data cube operator can be expressed easily. Concretely, we define an algebra and a calculus and show them to be equivalent. We conclude by comparing our approach to related work. The conceptual multi-dimensional database model developed here is orthogonal to its implementation, which is not a subject of the present paper.

355 citations
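
To make the data cube operator mentioned in the abstract above concrete, here is a minimal Python sketch. It is not the algebra or calculus defined in the paper; it only illustrates the common meaning of the operator, which aggregates a measure over every subset of the dimensions. The function name `data_cube`, the `ALL` placeholder, and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

ALL = "ALL"  # placeholder for a dimension that has been aggregated away


def data_cube(rows, dims, measure):
    """Sum `measure` for every subset of `dims` (2^n group-bys)."""
    cube = defaultdict(int)
    for k in range(len(dims) + 1):
        for kept in combinations(dims, k):
            for row in rows:
                # Keep the value of retained dimensions, roll up the rest.
                key = tuple(row[d] if d in kept else ALL for d in dims)
                cube[key] += row[measure]
    return dict(cube)


# Hypothetical fact rows: two dimensions and one measure.
rows = [
    {"region": "east", "product": "pen", "sales": 10},
    {"region": "east", "product": "ink", "sales": 5},
    {"region": "west", "product": "pen", "sales": 7},
]
cube = data_cube(rows, ["region", "product"], "sales")
```

Each key of `cube` is a tuple with one entry per dimension, so `cube[("east", "ALL")]` holds the total for the east region across all products.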

Proceedings Article
03 Jun 2002
TL;DR: This paper asks if the traditional relational query acceleration techniques of summary tables and covering indexes have analogs for branching path expression queries over tree- or graph-structured XML data and shows that the forward-and-backward index already proposed in the literature can be viewed as a structure analogous to a summary table or covering index.
Abstract: In this paper, we ask if the traditional relational query acceleration techniques of summary tables and covering indexes have analogs for branching path expression queries over tree- or graph-structured XML data. Our answer is yes --- the forward-and-backward index already proposed in the literature can be viewed as a structure analogous to a summary table or covering index. We also show that it is the smallest such index that covers all branching path expression queries. While this index is very general, our experiments show that it can be so large in practice as to offer little performance improvement over evaluating queries directly on the data. Likening the forward-and-backward index to a covering index on all the attributes of several tables, we devise an index definition scheme to restrict the class of branching path expressions being indexed. The resulting index structures are dramatically smaller and perform better than the full forward-and-backward index for these classes of branching path expressions. This is roughly analogous to the situation in multidimensional or OLAP workloads, in which more highly aggregated summary tables can service a smaller subset of queries but can do so at increased performance. We evaluate the performance of our indexes on both relational decompositions of XML and a native storage technique. As expected, the performance benefit of an index is maximized when the query matches the index definition.

352 citations

Patent
03 Apr 2001
TL;DR: An automated method translates a relational database of arbitrary structure into a multi-dimensional data model suitable for OLAP, deriving OLAP measures and then OLAP dimensions from the normalized relational tables.
Abstract: A Relational Database Management System (RDBMS) having any arbitrary structure is translated into a multi-dimensional data model suitable for performing OLAP operations upon. If a relational table defining the relational model includes any tables with cardinality of 1,1 or 0,1, the tables are merged into a single table. If the relational table is not normalized, then normalization is performed and a relationship between the original table and the normalized table is created. If the relational table is normalized, but not by dependence between columns, such as in the dimension table in a snowflake schema, the normalization process is performed using the foreign key in order to generate the normalized table. Once the normalized table is generated, OLAP measures are derived from the normalized relational table by an automated method. In addition, OLAP dimensions are derived from the normalized relational table and the results of the OLAP measures derivation by an automated method according to the present invention. According to an aspect, it is possible to associate a member of a dimension to another member of the same or another dimension. According to another aspect, it is possible to create a new dimension of analysis, the members of which are all the different values that a scalar expression can take on. According to yet another aspect, it is possible to access the various instances of a Reporting Object as members in an OLAP dimension. According to still another aspect, it is possible to apply opaque filters or a combination of them to the data that underlies analysis.

344 citations

Journal Article
TL;DR: This work presents polynomial-time heuristics for a selection of views to optimize total query response time under a disk-space constraint, for some important special cases of the general data warehouse scenario, viz. an AND view graph, where each query/view has a unique evaluation, and extends this heuristic to the general AND-OR view graphs.
Abstract: A data warehouse stores materialized views of data from one or more sources, with the purpose of efficiently implementing decision-support or OLAP queries. One of the most important decisions in designing a data warehouse is the selection of materialized views to be maintained at the warehouse. The goal is to select an appropriate set of views that minimizes total query response time and the cost of maintaining the selected views, given a limited amount of resource, e.g., materialization time, storage space, etc. In this work, we have developed a theoretical framework for the general problem of selection of views in a data warehouse. We present polynomial-time heuristics for a selection of views to optimize total query response time under a disk-space constraint, for some important special cases of the general data warehouse scenario, viz.: 1) an AND view graph, where each query/view has a unique evaluation, e.g., when a multiple-query optimizer can be used to generate a global evaluation plan for the queries, and 2) an OR view graph, in which any view can be computed from any one of its related views, e.g., data cubes. We present proofs showing that the algorithms are guaranteed to provide a solution that is fairly close to (within a constant factor ratio of) the optimal solution. We extend our heuristic to the general AND-OR view graphs. Finally, we address in detail the view-selection problem under the maintenance cost constraint and present provably competitive heuristics.

341 citations
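
The view-selection problem described in the abstract above can be illustrated with a small greedy sketch in Python. This is not the paper's algorithm and carries none of its guarantees; it is a simplified benefit-per-unit-space greedy for an OR view graph, under the assumptions that answering a query costs the size of the smallest materialized view that can answer it and that the base view can answer every query. The names `greedy_select`, `sizes`, and `answers`, and all the numbers, are hypothetical.

```python
def greedy_select(sizes, answers, base, budget):
    """Pick extra views to materialize within `budget` units of space.

    sizes:   view -> size (also the assumed cost of answering from it)
    answers: view -> set of queries that view can answer (OR semantics)
    base:    view that is always materialized and answers every query
    """
    chosen = {base}
    cost = {q: sizes[base] for q in answers[base]}
    used = 0
    while True:
        best, best_ratio = None, 0.0
        for view in sizes:
            if view in chosen or used + sizes[view] > budget:
                continue
            # Benefit: total query-cost reduction if this view were added.
            benefit = sum(max(0, cost[q] - sizes[view]) for q in answers[view])
            ratio = benefit / sizes[view]
            if ratio > best_ratio:
                best, best_ratio = view, ratio
        if best is None:
            break
        chosen.add(best)
        used += sizes[best]
        for q in answers[best]:
            cost[q] = min(cost[q], sizes[best])
    return chosen, sum(cost.values())


# Hypothetical two-dimensional cube lattice: rp (base), r, p, none.
sizes = {"rp": 100, "r": 10, "p": 20, "none": 1}
answers = {
    "rp": {"rp", "r", "p", "none"},
    "r": {"r", "none"},
    "p": {"p", "none"},
    "none": {"none"},
}
chosen, total_cost = greedy_select(sizes, answers, base="rp", budget=25)
```

With a budget of 25, the sketch first picks the tiny `none` view, then `r`, and then stops because `p` no longer fits.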

Proceedings Article
01 Jun 1997
TL;DR: The summary-delta table method is proposed for maintaining summary tables in a data warehouse, both to minimize the batch window needed to maintain a single summary table and to maintain a large set of summary tables defined over the same base tables.
Abstract: Data warehouses contain large amounts of information, often collected from a variety of independent sources. Decision-support functions in a warehouse, such as on-line analytical processing (OLAP), involve hundreds of complex aggregate queries over large volumes of data. It is not feasible to compute these queries by scanning the data sets each time. Warehouse applications therefore build a large number of summary tables, or materialized aggregate views, to help them increase the system performance. As changes, most notably new transactional data, are collected at the data sources, all summary tables at the warehouse that depend upon this data need to be updated. Usually, source changes are loaded into the warehouse at regular intervals, usually once a day, in a batch window, and the warehouse is made unavailable for querying while it is updated. Since the number of summary tables that need to be maintained is often large, a critical issue for data warehousing is how to maintain the summary tables efficiently. In this paper we propose a method of maintaining aggregate views (the summary-delta table method), and use it to solve two problems in maintaining summary tables in a warehouse: (1) how to efficiently maintain a summary table while minimizing the batch window needed for maintenance, and (2) how to maintain a large set of summary tables defined over the same base tables. While several papers have addressed the issues relating to choosing and materializing a set of summary tables, this is the first paper to address maintaining summary tables efficiently.

337 citations
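
As a rough illustration of the idea behind maintaining a summary table from a delta, the Python sketch below summarizes a batch of source changes into per-group net changes and then applies them to an existing SUM/COUNT summary table. The abstract above does not spell out the method's steps, so the two-phase split, the function names `propagate` and `refresh`, and the sample data are assumptions, not the paper's definitions.

```python
from collections import defaultdict


def propagate(insertions, deletions, key, measure):
    """Build a summary-delta: net change in SUM and COUNT per group."""
    delta = defaultdict(lambda: [0, 0])
    for sign, rows in ((1, insertions), (-1, deletions)):
        for row in rows:
            entry = delta[row[key]]
            entry[0] += sign * row[measure]  # net change in SUM
            entry[1] += sign                 # net change in COUNT
    return {group: tuple(change) for group, change in delta.items()}


def refresh(summary, delta):
    """Apply a summary-delta to a {group: (sum, count)} table in place."""
    for group, (d_sum, d_count) in delta.items():
        total, count = summary.get(group, (0, 0))
        total, count = total + d_sum, count + d_count
        if count == 0:
            summary.pop(group, None)  # no base rows left in this group
        else:
            summary[group] = (total, count)
    return summary


# Hypothetical summary table: total sales and row count per region.
summary = {"east": (15, 2), "west": (7, 1)}
delta = propagate(
    insertions=[{"region": "east", "sales": 3}],
    deletions=[{"region": "west", "sales": 7}],
    key="region",
    measure="sales",
)
refresh(summary, delta)
```

Computing `delta` needs only the changed rows, so in this sketch the summary table is touched only during the short `refresh` step. That is the kind of batch-window reduction the abstract describes.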


Network Information
Related Topics (5)
Web service
57.6K papers, 989K citations
82% related
Ontology (information science)
57K papers, 869.1K citations
80% related
Cluster analysis
146.5K papers, 2.9M citations
80% related
Web page
50.3K papers, 975.1K citations
79% related
Server
79.5K papers, 1.4M citations
79% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  43
2022  119
2021  75
2020  144
2019  161
2018  195