Topic

Online analytical processing

About: Online analytical processing is a research topic. Over the lifetime, 5042 publications have been published within this topic receiving 92175 citations. The topic is also known as: OLAP.

...read moreread less

Papers published on a yearly basis

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

Wavelet-based histograms for selectivity estimation

[...]

Yossi Matias¹, Jeffrey Scott Vitter², Min Wang²•Institutions (2)

Tel Aviv University¹, Duke University²

01 Jun 1998

TL;DR: This paper presents a technique based upon a multiresolution wavelet decomposition for building histograms on the underlying data distributions, with applications to databases, statistics, and simulation.

...read moreread less

Abstract: Query optimization is an integral part of relational database management systems. One important task in query optimization is selectivity estimation, that is, given a query P, we need to estimate the fraction of records in the database that satisfy P. Many commercial database systems maintain histograms to approximate the frequency distribution of values in the attributes of relations.In this paper, we present a technique based upon a multiresolution wavelet decomposition for building histograms on the underlying data distributions, with applications to databases, statistics, and simulation. Histograms built on the cumulative data distributions give very good approximations with limited space usage. We give fast algorithms for constructing histograms and using them in an on-line fashion for selectivity estimation. Our histograms also provide quick approximate answers to OLAP queries when the exact answers are not required. Our method captures the joint distribution of multiple attributes effectively, even when the attributes are correlated. Experiments confirm that our histograms offer substantial improvements in accuracy over random sampling and other previous approaches.

...read moreread less

464 citations

Proceedings Article•DOI•

An array-based algorithm for simultaneous multidimensional aggregates

[...]

Yihong Zhao¹, Prasad M. Deshpande¹, Jeffrey F. Naughton¹•Institutions (1)

University of Wisconsin-Madison¹

01 Jun 1997

TL;DR: In this paper, a MOLAP algorithm was proposed to compute the Cube operator for multi-dimensional OLAP (MOLAP) systems, which store their data in sparse arrays rather than in tables.

...read moreread less

Abstract: Computing multiple related group-bys and aggregates is one of the core operations of On-Line Analytical Processing (OLAP) applications. Recently, Gray et al. [GBLP95] proposed the “Cube” operator, which computes group-by aggregations over all possible subsets of the specified dimensions. The rapid acceptance of the importance of this operator has led to a variant of the Cube being proposed for the SQL standard. Several efficient algorithms for Relational OLAP (ROLAP) have been developed to compute the Cube. However, to our knowledge there is nothing in the literature on how to compute the Cube for Multidimensional OLAP (MOLAP) systems, which store their data in sparse arrays rather than in tables. In this paper, we present a MOLAP algorithm to compute the Cube, and compare it to a leading ROLAP algorithm. The comparison between the two is interesting, since although they are computing the same function, one is value-based (the ROLAP algorithm) whereas the other is position-based (the MOLAP algorithm). Our tests show that, given appropriate compression techniques, the MOLAP algorithm is significantly faster than the ROLAP algorithm. In fact, the difference is so pronounced that this MOLAP algorithm may be useful for ROLAP systems as well as MOLAP systems, since in many cases, instead of cubing a table directly, it is faster to first convert the table to an array, cube the array, then convert the result back to a table.

...read moreread less

449 citations

Journal Article•DOI•

An Integrated System for Regional Environmental Monitoring and Management Based on Internet of Things

[...]

Shifeng Fang, Li Da Xu, Yunqiang Zhu, Jiaerheng Ahati, Huan Pei¹, Jianwu Yan, Zhihui Liu² - Show less +3 more•Institutions (2)

Yanshan University¹, Xinjiang University²

27 Jan 2014-IEEE Transactions on Industrial Informatics

TL;DR: The study shows that the research work is greatly benefited from such an IIS, not only in data collection supported by IoT, but also in Web services and applications based on cloud computing and e-Science platforms, and the effectiveness of monitoring processes and decision-making can be obviously improved.

...read moreread less

Abstract: Climate change and environmental monitoring and management have received much attention recently, and an integrated information system (IIS) is considered highly valuable. This paper introduces a novel IIS that combines Internet of Things (IoT), Cloud Computing, Geoinformatics [remote sensing (RS), geographical information system (GIS), and global positioning system (GPS)], and e-Science for environmental monitoring and management, with a case study on regional climate change and its ecological effects. Multi-sensors and Web services were used to collect data and other information for the perception layer; both public networks and private networks were used to access and transport mass data and other information in the network layer. The key technologies and tools include real-time operational database (RODB); extraction-transformation-loading (ETL); on-line analytical processing (OLAP) and relational OLAP (ROLAP); naming, addressing, and profile server (NAPS); application gateway (AG); application software for different platforms and tasks (APPs); IoT application infrastructure (IoT-AI); GIS and e-Science platforms; and representational state transfer/Java database connectivity (RESTful/JDBC). Application Program Interfaces (APIs) were implemented in the middleware layer of the IIS. The application layer provides the functions of storing, organizing, processing, and sharing of data and other information, as well as the functions of applications in environmental monitoring and management. The results from the case study show that there is a visible increasing trend of the air temperature in Xinjiang over the last 50 years (1962-2011) and an apparent increasing trend of the precipitation since the early 1980s. Furthermore, from the correlation between ecological indicators [gross primary production (GPP), net primary production (NPP), and leaf area index (LAI)] and meteorological elements (air temperature and precipitation), water resource availability is the decisive factor with regard to the terrestrial ecosystem in the area. The study shows that the research work is greatly benefited from such an IIS, not only in data collection supported by IoT, but also in Web services and applications based on cloud computing and e-Science platforms, and the effectiveness of monitoring processes and decision-making can be obviously improved. This paper provides a prototype IIS for environmental monitoring and management, and it also provides a new paradigm for the future research and practice; especially in the era of big data and IoT.

...read moreread less

443 citations

Proceedings Article•DOI•

A common database approach for OLTP and OLAP using an in-memory column database

[...]

Hasso Plattner¹•Institutions (1)

Hasso Plattner Institute¹

29 Jun 2009

TL;DR: This paper will question some of the fundamentals of the OLAP and OLTP separation and present a new proposal for an enterprise data management concept that will allow for revolutionize transactional applications while providing an optimal platform for analytical data processing.

...read moreread less

Abstract: When SQL and the relational data model were introduced 25 years ago as a general data management concept, enterprise software migrated quickly to this new technology. It is fair to say that SQL and the various implementations of RDBMSs became the backbone of enterprise systems. In those days. we believed that business planning, transaction processing and analytics should reside in one single system. Despite the incredible improvements in computer hardware, high-speed networks, display devices and the associated software, speed and flexibility remained an issue. The nature of RDBMSs, being organized along rows, prohibited us from providing instant analytical insight and finally led to the introduction of so-called data warehouses. This paper will question some of the fundamentals of the OLAP and OLTP separation. Based on the analysis of real customer environments and experience in some prototype implementations, a new proposal for an enterprise data management concept will be presented. In our proposal, the participants in enterprise applications, customers, orders, accounting documents, products, employees etc. will be modeled as objects and also stored and maintained as such. Despite that, the vast majority of business functions will operate on an in memory representation of their objects. Using the relational algebra and a column-based organization of data storage will allow us to revolutionize transactional applications while providing an optimal platform for analytical data processing. The unification of OLTP and OLAP workloads on a shared architecture and the reintegration of planning activities promise significant gains in application development while simplifying enterprise systems drastically. The latest trends in computer technology -- e.g. blade architecture, multiple CPUs per blade with multiple cores per CPU allow for a significant parallelization of application processes. The organization of data in columns supports the parallel use of cores for filtering and aggregation. Elements of application logic can be implemented as highly efficient stored procedures operating on columns. The vast increase in main memory combined with improvements in L1--, L2--, L3--caching, together with the high data compression rate column storage will allow us to support substantial data volumes on one single blade. Distributing data across multiple blades using a shared nothing approach provides further scalability.

...read moreread less

404 citations

Proceedings Article•

Materialized Views Selection in a Multidimensional Database

[...]

Elena Baralis, Stefano Paraboschi, Ernest Teniente

25 Aug 1997

TL;DR: The technique is proposed reduces the soluticn space by considering only the relevant elements of the multidimensional lattice whose elements represent the solution space of the problem.

...read moreread less

Abstract: A multidimensional database is a data repository that supports the efficient execution of complex business decision queries. Query response can be significantly improved by storing an appropriate set of materialized views. These views are selected from the multidimensional lattice whose elements represent the solution space of the problem. Several techniques have been proposed in the past to perform the selection of materialized views for databases with a reduced number of dimensions. When the number and complexity of dimensions increase, the proposed techniques do not scale well. The technique we are proposing reduces the soluticn space by considering only the relevant elements of the multidimensional lattice. An additional statistical analysis allows a further reduction of the solution space.

...read moreread less

396 citations

Collapse

Network Information

Performance

Metrics

5,201

Papers

96,420

Citations

No. of papers in the topic in previous years
Year	Papers
2023	43
2022	119
2021	75
2020	144
2019	161
2018	195

Online analytical processing

Papers published on a yearly basis

Papers

Trending Questions (10)

Network Information

Related Topics (5)

Performance

Metrics