Topic

Online analytical processing

About: Online analytical processing is a research topic. Over its lifetime, 5,042 publications have been published within this topic, receiving 92,175 citations. The topic is also known as: OLAP.


Papers
Journal Article
TL;DR: The authors explore the optimization potential of the Cube Algebra Query Language (CAQL) by applying logical rewriting inspired by classic relational algebra and parallelism, and evaluate the quality of the resulting queries through their observed performance characteristics.
Abstract: A common model used in addressing today's overwhelming amounts of data is the OLAP Cube. The OLAP community has proposed several cube algebras, although a standard has still not been nominated. This study focuses on a recent addition to the cube algebras: the user-centric Cube Algebra Query Language (CAQL). The study aims to explore the optimization potential of this algebra by applying logical rewriting inspired by classic relational algebra and parallelism. The lack of standard algebra is often cited as a problem in such discussions. Thus, the significance of this work is that of strengthening the position of this algebra within the OLAP algebras by addressing implementation details. The modern open-source PostgreSQL relational engine is used to encode the CAQL abstraction. A query workload based on a well-known dataset is adopted, and CAQL and SQL implementations are compared. Finally, the quality of the query created is evaluated through the observed performance characteristics of the query. Results show strong improvements over the baseline case of the unoptimized query.
Posted Content
20 Feb 2023
TL;DR: A native database is proposed that (1) provides a near-data machine learning framework for generating real-time business insight, in which predefined change thresholds trigger online training and deployment of new models, and (2) offers a mixed-format store to guarantee the performance of HTAP workloads.
Abstract: Native database (1) provides a near-data machine learning framework to facilitate generating real-time business insight, and predefined change thresholds will trigger online training and deployment of new models, and (2) offers a mixed-format store to guarantee the performance of HTAP workloads, especially the hybrid workloads that consist of OLAP queries in-between online transactions. We make rigorous test plans for native database with an enhanced state-of-the-art HTAP benchmark.
Posted Content
19 Jan 2022
TL;DR: The authors propose read safe snapshot (RSS) using multiversion CC theory and introduce an RSS construction algorithm utilizing serializable snapshot isolation (SSI) to achieve serializability in HTAP systems.
Abstract: Concurrency Control (CC) ensuring consistency of updated data is an essential element of OLTP systems. Recently, hybrid transactional/analytical processing (HTAP) systems developed for executing OLTP and OLAP have attracted much attention. The OLAP side CC domain has been isolated from OLTP's CC and in many cases has been achieved by snapshot isolation (SI) to establish HTAP systems. Although higher isolation level is ideal, considering OLAP read-only transactions in the context of OLTP scheduling achieving serializability forces aborts/waits and would be a potential performance problem. Furthermore, executing OLAP without affecting OLTP as much as possible is needed for HTAP systems. The aim of this study was serializability without additional aborts/waits. We propose read safe snapshot (RSS) using multiversion CC (MVCC) theory and introduce the RSS construction algorithm utilizing serializable snapshot isolation (SSI). For serializability of HTAP systems, our model makes use of multiversion and allows more schedules with read operations whose corresponding write operations do not participate in the dependency cycles. Furthermore, we implemented the algorithm practically in an open-source database system that offers SSI. Our algorithm was integrated into two types of architecture as HTAP systems called as unified (single-node) or decoupled (multinode) storage architecture. We evaluate the performance and abort rate of the single-node architecture where SSI is applicable. The multi-node architecture was investigated for examining the performance overhead applying our algorithm.
Book Chapter
01 Jan 2023
TL;DR: The authors explore augmented analytics in the context of BI, using ML and Natural Language Processing (NLP) resources and capabilities, as an innovative paradigm for the decision-making process.
Abstract: Business intelligence (BI) and analytics are a set of techniques, methodologies, and tools used in the analysis of business data, which allow users (decision makers) to have a clearer view of the market, leveraging the decision-making process, allowing timely business decisions. Usually BI refers to Extract Transform and Load processes (ETL), Data Warehouse (DW), Data mining (DM), online analytical processing (OLAP), visualization tools, and reports. In turn, the analytics generally uses advanced techniques, providing BI users with Artificial Intelligence (AI) and Machine Learning (ML) techniques. In this context, Gartner introduced the term “augmented analytics” in 2017, making the line between BI and advanced analytics clear. The main objective of this work is to explore the area of Augmented Analytics in the context of BI, through the use of ML and Natural Language Processing (NLP) resources and capabilities, as an innovative paradigm of augmented analytics in the decision-making process.
Proceedings Article
21 Jul 2015
TL;DR: A method for online analytical processing based on Hive that can not only upload large quantities of data, improving performance of upload and analysis, but also discover knowledge of location and meaningful information hidden in massive warnings from multiple dimensions, providing decision support for network management personnel.
Abstract: Network alarm information analysis in the next generation network suffers from problems such as single-dimension analysis, low efficiency, and memory overflow or crashes. To address them, this paper proposes a method for online analytical processing based on Hive. The method first uses HBase to preprocess real-time alarm data, mapping the data into Hive, and then constructs an N-D cube model. After cube calculation demonstrates that the model is sound and operable, a star model is constructed with customized dimensions, each having user-defined hierarchies. Finally, HQL is used to realize rollup and cube. The method applies whether or not inter-domain devices exist. It can not only upload large quantities of data, improving the performance of upload and analysis, but also discover location knowledge and meaningful information hidden in massive warnings from multiple dimensions, providing decision support for network management personnel.
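The rollup and cube operations mentioned in this abstract can be illustrated outside Hive with a minimal Python sketch. This is not the paper's implementation: the alarm records, the field names (region, device, severity), and the helper functions are hypothetical and chosen only for illustration. The sketch counts alarms per grouping, in the way HiveQL's GROUP BY ... WITH ROLLUP and GROUP BY ... WITH CUBE produce subtotals, using None to mark a dimension that has been aggregated away.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical alarm records; the field names are illustrative assumptions.
ALARMS = [
    {"region": "north", "device": "router", "severity": "major"},
    {"region": "north", "device": "switch", "severity": "minor"},
    {"region": "south", "device": "router", "severity": "major"},
    {"region": "north", "device": "router", "severity": "minor"},
]


def _aggregate(rows, dims, groupings):
    """Count rows per grouping; None marks a dimension that is rolled up."""
    counts = defaultdict(int)
    for row in rows:
        for grouping in groupings:
            key = tuple(row[d] if d in grouping else None for d in dims)
            counts[key] += 1
    return dict(counts)


def rollup(rows, dims):
    """Subtotals for each prefix of dims, from the full list down to the grand total."""
    groupings = [dims[:i] for i in range(len(dims), -1, -1)]
    return _aggregate(rows, dims, groupings)


def cube(rows, dims):
    """Subtotals for every subset of dims, including the grand total."""
    groupings = [
        subset
        for size in range(len(dims), -1, -1)
        for subset in combinations(dims, size)
    ]
    return _aggregate(rows, dims, groupings)


if __name__ == "__main__":
    for key, count in sorted(cube(ALARMS, ("region", "device")).items(), key=str):
        print(key, count)
```

With n dimensions, rollup produces n + 1 groupings while cube produces 2^n, which is why cube results grow quickly as dimensions are added.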

Network Information
Related Topics (5)
Web service: 57.6K papers, 989K citations, 82% related
Ontology (information science): 57K papers, 869.1K citations, 80% related
Cluster analysis: 146.5K papers, 2.9M citations, 80% related
Web page: 50.3K papers, 975.1K citations, 79% related
Server: 79.5K papers, 1.4M citations, 79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    43
2022    119
2021    75
2020    144
2019    161
2018    195