scispace - formally typeset
Search or ask a question
Topic

Online analytical processing

About: Online analytical processing is a research topic. Over the lifetime, 5042 publications have been published within this topic receiving 92175 citations. The topic is also known as: OLAP.


Papers
More filters
Proceedings ArticleDOI
07 Apr 1997
TL;DR: The authors give algorithms that automate the selection of summary tables and indexes, and present a family of algorithms of increasing time complexities, and prove strong performance bounds for them.
Abstract: On-line analytical processing (OLAP) is a recent and important application of database systems. Typically, OLAP data is presented as a multidimensional "data cube." OLAP queries are complex and can take many hours or even days to run, if executed directly on the raw data. The most common method of reducing execution time is to precompute some of the queries into summary tables (subcubes of the data cube) and then to build indexes on these summary tables. In most commercial OLAP systems today, the summary tables that are to be precomputed are picked first, followed by the selection of the appropriate indexes on them. A trial-and-error approach is used to divide the space available between the summary tables and the indexes. This two-step process can perform very poorly. Since both summary tables and indexes consume the same resource-space-their selection should be done together for the most efficient use of space. The authors give algorithms that automate the selection of summary tables and indexes. In particular, they present a family of algorithms of increasing time complexities, and prove strong performance bounds for them. The algorithms with higher complexities have better performance bounds. However, the increase in the performance bound is diminishing, and they show that an algorithm of moderate complexity can perform fairly close to the optimal.

545 citations

Proceedings ArticleDOI
01 Jun 1997
TL;DR: A new method whereby multi-dimensional group-by queries, reminiscent of OLAP/Datacube queries but with more flexibility, can be very efficiently performed is introduced.
Abstract: The read-mostly environment of data warehousing makes it possible to use more complex indexes to speed up queries than in situations where concurrent updates are present. The current paper presents a short review of current indexing technology, including row-set representation by Bitmaps, and then introduces two approaches we call Bit-Sliced indexing and Projection indexing. A Projection index materializes all values of a column in RID order, and a Bit-Sliced index essentially takes an orthogonal bit-by-bit view of the same data. While some of these concepts started with the MODEL 204 product, and both Bit-Sliced and Projection indexing are now fully realized in Sybase IQ, this is the first rigorous examination of such indexing capabilities in the literature. We compare algorithms that become feasible with these variant index types against algorithms using more conventional indexes. The analysis demonstrates important performance advantages for variant indexes in some types of SQL aggregation, predicate evaluation, and grouping. The paper concludes by introducing a new method whereby multi-dimensional group-by queries, reminiscent of OLAP/Datacube queries but with more flexibility, can be very efficiently performed.

545 citations

Journal ArticleDOI
Charu C. Aggarwal1, Philip S. Yu1
TL;DR: This paper provides a survey of uncertain data mining and management applications, and discusses different methodologies to process and mine uncertain data in a variety of forms.
Abstract: In recent years, a number of indirect data collection methodologies have lead to the proliferation of uncertain data. Such data points are often represented in the form of a probabilistic function, since the corresponding deterministic value is not known. This increases the challenge of mining and managing uncertain data, since the precise behavior of the underlying data is no longer known. In this paper, we provide a survey of uncertain data mining and management applications. In the field of uncertain data management, we will examine traditional methods such as join processing, query processing, selectivity estimation, OLAP queries, and indexing. In the field of uncertain data mining, we will examine traditional mining problems such as classification and clustering. We will also examine a general transform based technique for mining uncertain data. We discuss the models for uncertain data, and how they can be leveraged in a variety of applications. We discuss different methodologies to process and mine uncertain data in a variety of forms.

497 citations

Book
01 Jan 1999
TL;DR: Database System Implementation focuses on the implementation of database systems, including storage structures, query processing, and transaction management, and provides extensive coverage of query processing.
Abstract: From the Publisher: Three well-known computer scientists at Stanford University-Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom-have written one of the most comprehensive books on database system implementation. Hector Garcia- Molina pioneered this book at Stanford as a second database systems course for computer science majors and industry-based professionals. It focuses on the implementation of database systems, including storage structures, query processing, and transaction management. Database System Implementation is valuable as an academic textbook or a professional reference. Noteworthy Features Provides extensive coverage of query processing, including major algorithms for execution of queries and techniques for optimizing queries Covers information integration, including warehousing and mediators, OLAP, and data-cube systems Explains error-correction in RAID disks and covers bitmap indexes, data mining, data statistics, and pointer swizzling Supports additional teaching materials found on the book's Web page at ...

479 citations


Network Information
Related Topics (5)
Web service
57.6K papers, 989K citations
82% related
Ontology (information science)
57K papers, 869.1K citations
80% related
Cluster analysis
146.5K papers, 2.9M citations
80% related
Web page
50.3K papers, 975.1K citations
79% related
Server
79.5K papers, 1.4M citations
79% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202343
2022119
202175
2020144
2019161
2018195