Online analytical processing

About: Online analytical processing is a research topic. Over the lifetime, 5042 publications have been published within this topic receiving 92175 citations. The topic is also known as: OLAP.


Papers
Proceedings Article
01 Jun 1998
TL;DR: This paper presents a technique based upon a multiresolution wavelet decomposition for building histograms on the underlying data distributions, with applications to databases, statistics, and simulation.
Abstract: Query optimization is an integral part of relational database management systems. One important task in query optimization is selectivity estimation, that is, given a query P, we need to estimate the fraction of records in the database that satisfy P. Many commercial database systems maintain histograms to approximate the frequency distribution of values in the attributes of relations. In this paper, we present a technique based upon a multiresolution wavelet decomposition for building histograms on the underlying data distributions, with applications to databases, statistics, and simulation. Histograms built on the cumulative data distributions give very good approximations with limited space usage. We give fast algorithms for constructing histograms and using them in an on-line fashion for selectivity estimation. Our histograms also provide quick approximate answers to OLAP queries when the exact answers are not required. Our method captures the joint distribution of multiple attributes effectively, even when the attributes are correlated. Experiments confirm that our histograms offer substantial improvements in accuracy over random sampling and other previous approaches.

464 citations
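
To make the idea in the abstract above concrete, here is a minimal sketch of a wavelet-based histogram for selectivity estimation. It is not the paper's algorithm: it assumes a one-dimensional attribute with a power-of-two number of distinct values, uses a plain unnormalized Haar transform, and keeps the largest-magnitude coefficients without any of the paper's refinements. The function names (`haar_decompose`, `haar_reconstruct`, `compress`, `estimate_selectivity`) and the sample frequencies are hypothetical.

```python
from itertools import accumulate


def haar_decompose(values):
    """Unnormalized Haar transform; len(values) must be a power of two."""
    data = list(values)
    details = []
    while len(data) > 1:
        averages = [(data[i] + data[i + 1]) / 2 for i in range(0, len(data), 2)]
        diffs = [(data[i] - data[i + 1]) / 2 for i in range(0, len(data), 2)]
        details = diffs + details  # coarser details go in front
        data = averages
    return data + details  # [overall average, coarsest detail, ..., finest details]


def haar_reconstruct(coeffs):
    """Invert haar_decompose."""
    data = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        diffs = coeffs[pos:pos + len(data)]
        data = [v for a, d in zip(data, diffs) for v in (a + d, a - d)]
        pos += len(diffs)
    return data


def compress(coeffs, keep):
    """Zero all but the `keep` largest-magnitude coefficients (the 'histogram')."""
    ranked = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(ranked[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


def estimate_selectivity(coeffs, lo, hi, total):
    """Estimate the fraction of rows whose value index lies in [lo, hi]."""
    cumulative = haar_reconstruct(coeffs)
    below = cumulative[lo - 1] if lo > 0 else 0.0
    return (cumulative[hi] - below) / total


# Hypothetical frequency distribution over 8 distinct attribute values.
frequencies = [2, 2, 0, 2, 3, 5, 4, 4]
total_rows = sum(frequencies)
cumulative = list(accumulate(frequencies))  # the abstract builds on cumulative data

coeffs = haar_decompose(cumulative)
exact = estimate_selectivity(coeffs, 2, 5, total_rows)                 # all coefficients
approx = estimate_selectivity(compress(coeffs, 4), 2, 5, total_rows)   # 4 of 8 kept
```

With all coefficients retained the estimate equals the true selectivity; dropping coefficients trades accuracy for space, which is the trade-off the abstract describes.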

Proceedings Article
01 Jun 1997
TL;DR: This paper presents an algorithm to compute the Cube operator for Multidimensional OLAP (MOLAP) systems, which store their data in sparse arrays rather than in tables, and compares it to a leading ROLAP algorithm.
Abstract: Computing multiple related group-bys and aggregates is one of the core operations of On-Line Analytical Processing (OLAP) applications. Recently, Gray et al. [GBLP95] proposed the “Cube” operator, which computes group-by aggregations over all possible subsets of the specified dimensions. The rapid acceptance of the importance of this operator has led to a variant of the Cube being proposed for the SQL standard. Several efficient algorithms for Relational OLAP (ROLAP) have been developed to compute the Cube. However, to our knowledge there is nothing in the literature on how to compute the Cube for Multidimensional OLAP (MOLAP) systems, which store their data in sparse arrays rather than in tables. In this paper, we present a MOLAP algorithm to compute the Cube, and compare it to a leading ROLAP algorithm. The comparison between the two is interesting, since although they are computing the same function, one is value-based (the ROLAP algorithm) whereas the other is position-based (the MOLAP algorithm). Our tests show that, given appropriate compression techniques, the MOLAP algorithm is significantly faster than the ROLAP algorithm. In fact, the difference is so pronounced that this MOLAP algorithm may be useful for ROLAP systems as well as MOLAP systems, since in many cases, instead of cubing a table directly, it is faster to first convert the table to an array, cube the array, then convert the result back to a table.

449 citations
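
As an illustration of what the Cube operator in the abstract above computes, the following sketch aggregates a measure over every subset of the specified dimensions. It is a naive, value-based computation that rescans the rows once per group-by; it is not the array-based MOLAP algorithm the paper proposes, nor any of the optimized ROLAP algorithms it mentions. The function `cube` and the sample rows are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cube(rows, dims, measure):
    """Return {dimension_subset: {key_tuple: sum of measure}} for every subset of dims."""
    result = {}
    for size in range(len(dims) + 1):
        for subset in combinations(dims, size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in subset)  # group-by key for this subset
                totals[key] += row[measure]
            result[subset] = dict(totals)
    return result


# Hypothetical fact table with two dimensions and one measure.
rows = [
    {"region": "east", "product": "a", "sales": 10},
    {"region": "east", "product": "b", "sales": 5},
    {"region": "west", "product": "a", "sales": 7},
]
result = cube(rows, ("region", "product"), "sales")
```

For two dimensions the result holds four group-bys: the grand total, one per single dimension, and the full `(region, product)` grouping.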

Journal Article
TL;DR: The study shows that research work benefits greatly from such an IIS, not only in data collection supported by IoT but also in Web services and applications based on cloud computing and e-Science platforms, and that the effectiveness of monitoring processes and decision-making can be clearly improved.
Abstract: Climate change and environmental monitoring and management have received much attention recently, and an integrated information system (IIS) is considered highly valuable. This paper introduces a novel IIS that combines Internet of Things (IoT), Cloud Computing, Geoinformatics [remote sensing (RS), geographical information system (GIS), and global positioning system (GPS)], and e-Science for environmental monitoring and management, with a case study on regional climate change and its ecological effects. Multi-sensors and Web services were used to collect data and other information for the perception layer; both public networks and private networks were used to access and transport mass data and other information in the network layer. The key technologies and tools include real-time operational database (RODB); extraction-transformation-loading (ETL); on-line analytical processing (OLAP) and relational OLAP (ROLAP); naming, addressing, and profile server (NAPS); application gateway (AG); application software for different platforms and tasks (APPs); IoT application infrastructure (IoT-AI); GIS and e-Science platforms; and representational state transfer/Java database connectivity (RESTful/JDBC). Application Program Interfaces (APIs) were implemented in the middleware layer of the IIS. The application layer provides the functions of storing, organizing, processing, and sharing of data and other information, as well as the functions of applications in environmental monitoring and management. The results from the case study show that there is a visible increasing trend of the air temperature in Xinjiang over the last 50 years (1962-2011) and an apparent increasing trend of the precipitation since the early 1980s. 
Furthermore, the correlation between ecological indicators [gross primary production (GPP), net primary production (NPP), and leaf area index (LAI)] and meteorological elements (air temperature and precipitation) indicates that water resource availability is the decisive factor with regard to the terrestrial ecosystem in the area. The study shows that research work benefits greatly from such an IIS, not only in data collection supported by IoT but also in Web services and applications based on cloud computing and e-Science platforms, and that the effectiveness of monitoring processes and decision-making can be clearly improved. This paper provides a prototype IIS for environmental monitoring and management, and it also provides a new paradigm for future research and practice, especially in the era of big data and IoT.

443 citations

Proceedings Article
29 Jun 2009
TL;DR: This paper will question some of the fundamentals of the OLAP and OLTP separation and present a new proposal for an enterprise data management concept that will revolutionize transactional applications while providing an optimal platform for analytical data processing.
Abstract: When SQL and the relational data model were introduced 25 years ago as a general data management concept, enterprise software migrated quickly to this new technology. It is fair to say that SQL and the various implementations of RDBMSs became the backbone of enterprise systems. In those days, we believed that business planning, transaction processing and analytics should reside in one single system. Despite the incredible improvements in computer hardware, high-speed networks, display devices and the associated software, speed and flexibility remained an issue. The nature of RDBMSs, being organized along rows, prohibited us from providing instant analytical insight and finally led to the introduction of so-called data warehouses. This paper will question some of the fundamentals of the OLAP and OLTP separation. Based on the analysis of real customer environments and experience in some prototype implementations, a new proposal for an enterprise data management concept will be presented. In our proposal, the participants in enterprise applications (customers, orders, accounting documents, products, employees, etc.) will be modeled as objects and also stored and maintained as such. Despite that, the vast majority of business functions will operate on an in-memory representation of their objects. Using the relational algebra and a column-based organization of data storage will allow us to revolutionize transactional applications while providing an optimal platform for analytical data processing. The unification of OLTP and OLAP workloads on a shared architecture and the reintegration of planning activities promise significant gains in application development while simplifying enterprise systems drastically. The latest trends in computer technology (e.g. blade architecture, multiple CPUs per blade with multiple cores per CPU) allow for a significant parallelization of application processes. 
The organization of data in columns supports the parallel use of cores for filtering and aggregation. Elements of application logic can be implemented as highly efficient stored procedures operating on columns. The vast increase in main memory, combined with improvements in L1, L2, and L3 caching and the high data compression rate of column storage, will allow us to support substantial data volumes on one single blade. Distributing data across multiple blades using a shared-nothing approach provides further scalability.

404 citations
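
The abstract above argues that a column-based organization suits filtering and aggregation. The sketch below shows the basic access pattern under simple assumptions: each attribute is stored as its own contiguous array, so a filtered sum reads only the two columns involved. It is single-threaded, and the table contents and the helper `sum_where` are hypothetical; it does not represent the system described in the paper.

```python
from array import array

# Hypothetical order table stored column by column rather than row by row.
orders = {
    "customer_id": array("i", [1, 2, 1, 3, 2]),
    "amount": array("d", [10.0, 20.0, 5.0, 7.5, 2.5]),
}


def sum_where(table, filter_col, wanted, measure_col):
    """Sum measure_col over the rows where filter_col equals wanted."""
    # Scan only the filter column to find matching row positions ...
    selected = [i for i, value in enumerate(table[filter_col]) if value == wanted]
    # ... then read only the measure column at those positions.
    measure = table[measure_col]
    return sum(measure[i] for i in selected)


total_for_customer_1 = sum_where(orders, "customer_id", 1, "amount")
```

Because each column is an independent array, a scan like this could in principle be split into chunks and handed to separate cores, which is the parallelism the abstract refers to.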

Proceedings Article
25 Aug 1997
TL;DR: The proposed technique reduces the solution space by considering only the relevant elements of the multidimensional lattice whose elements represent the solution space of the problem.
Abstract: A multidimensional database is a data repository that supports the efficient execution of complex business decision queries. Query response can be significantly improved by storing an appropriate set of materialized views. These views are selected from the multidimensional lattice whose elements represent the solution space of the problem. Several techniques have been proposed in the past to perform the selection of materialized views for databases with a reduced number of dimensions. When the number and complexity of dimensions increase, the proposed techniques do not scale well. The technique we are proposing reduces the solution space by considering only the relevant elements of the multidimensional lattice. An additional statistical analysis allows a further reduction of the solution space.

396 citations


Network Information
Related Topics (5)
Web service
57.6K papers, 989K citations
82% related
Ontology (information science)
57K papers, 869.1K citations
80% related
Cluster analysis
146.5K papers, 2.9M citations
80% related
Web page
50.3K papers, 975.1K citations
79% related
Server
79.5K papers, 1.4M citations
79% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    43
2022    119
2021    75
2020    144
2019    161
2018    195