
Showing papers on "Online analytical processing" published in 2011


Proceedings ArticleDOI
11 Apr 2011
TL;DR: This work presents an efficient hybrid system, called HyPer, that can handle both OLTP and OLAP simultaneously by using hardware-assisted replication mechanisms to maintain consistent snapshots of the transactional data.
Abstract: The two areas of online transaction processing (OLTP) and online analytical processing (OLAP) present different challenges for database architectures. Currently, customers with high rates of mission-critical transactions have split their data into two separate systems: one database for OLTP and one so-called data warehouse for OLAP. While allowing for decent transaction rates, this separation has many disadvantages, including data freshness issues, due to the delay caused by only periodically initiating the Extract-Transform-Load (ETL) data staging, and excessive resource consumption, due to maintaining two separate information systems. We present an efficient hybrid system, called HyPer, that can handle both OLTP and OLAP simultaneously by using hardware-assisted replication mechanisms to maintain consistent snapshots of the transactional data. HyPer is a main-memory database system that guarantees the ACID properties of OLTP transactions and executes OLAP query sessions (multiple queries) on the same, arbitrarily current and consistent snapshot. The utilization of the processor-inherent support for virtual memory management (address translation, caching, copy on update) yields both at the same time: unprecedented transaction rates of up to 100,000 per second and very fast OLAP query response times on a single system executing both workloads in parallel. The performance analysis is based on a combined TPC-C and TPC-H benchmark.
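
The virtual-memory snapshot mechanism described above maps naturally onto the POSIX fork() call: the forked child shares all pages copy-on-write, so an OLAP session in the child sees a frozen, consistent state while OLTP updates in the parent trigger page replication. Below is a minimal sketch of that control flow with an invented `accounts` table; note that CPython's reference counting dirties pages even on reads, so the memory economics HyPer reports require a lower-level engine — only the mechanism is illustrated.

```python
# Sketch of HyPer-style virtual-memory snapshotting via fork():
# the child inherits all memory pages copy-on-write and sees a
# consistent snapshot while the parent keeps applying OLTP updates.
import os

accounts = {i: 100 for i in range(1_000_000)}  # "transactional" state

pid = os.fork()
if pid == 0:
    # Child: OLAP session on the snapshot taken at fork time.
    print("snapshot total:", sum(accounts.values()))
    os._exit(0)
else:
    # Parent: OLTP continues; updated pages are replicated by the OS
    # (copy on update), leaving the child's snapshot untouched.
    accounts[0] += 42
    os.waitpid(pid, 0)
    print("current total:", sum(accounts.values()))
```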

674 citations


Proceedings ArticleDOI
28 Oct 2011
TL;DR: This paper provides an overview of state-of-the-art research issues and achievements in the field of analytics over big data, and extends the discussion to analytics over big multidimensional data as well, highlighting open problems and current research trends.
Abstract: In this paper, we provide an overview of state-of-the-art research issues and achievements in the field of analytics over big data, and we extend the discussion to analytics over big multidimensional data as well, by highlighting open problems and current research trends. Our analytical contribution is finally completed by several novel research directions arising in this field, which plays a leading role in next-generation Data Warehousing and OLAP research.

321 citations


Proceedings ArticleDOI
12 Jun 2011
TL;DR: Graph Cube is introduced, a new data warehousing model that supports OLAP queries effectively on large multidimensional networks and is shown to be a powerful and efficient tool for decision support on such networks.
Abstract: We consider extending decision support facilities toward large sophisticated networks, upon which multidimensional attributes are associated with network entities, thereby forming the so-called multidimensional networks. Data warehouses and OLAP (Online Analytical Processing) technology have proven to be effective tools for decision support on relational data. However, they are not well-equipped to handle the new yet important multidimensional networks. In this paper, we introduce Graph Cube, a new data warehousing model that supports OLAP queries effectively on large multidimensional networks. By taking account of both attribute aggregation and structure summarization of the networks, Graph Cube goes beyond the traditional data cube model involved solely with numeric value based group-by's, thus resulting in a more insightful and structure-enriched aggregate network within every possible multidimensional space. Besides traditional cuboid queries, a new class of OLAP queries, crossboid, is introduced that is uniquely useful in multidimensional networks and has not been studied before. We implement Graph Cube by combining special characteristics of multidimensional networks with the existing well-studied data cube techniques. We perform extensive experimental studies on a series of real world data sets and Graph Cube is shown to be a powerful and efficient tool for decision support on large multidimensional networks.
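
The core Graph Cube aggregation can be sketched concretely: rolling the network up to a subset of dimensions groups vertices by their attribute values and aggregates the edges crossing those groups into an aggregate network. A minimal sketch with invented attributes (gender, city); the paper's crossboid queries and cube materialization are not shown.

```python
# One cuboid of a Graph Cube: vertices become attribute-value groups,
# edge weights count the original edges crossing those groups.
from collections import Counter

nodes = {1: {"gender": "M", "city": "NY"}, 2: {"gender": "F", "city": "NY"},
         3: {"gender": "F", "city": "LA"}}
edges = [(1, 2), (2, 3), (1, 3)]

def graph_cuboid(dims):
    """Aggregate network for one cuboid, e.g. dims=("gender",)."""
    group = lambda n: tuple(nodes[n][d] for d in dims)
    agg_nodes = Counter(group(n) for n in nodes)             # group sizes
    agg_edges = Counter(tuple(sorted((group(u), group(v))))  # cross-group
                        for u, v in edges)                   # edge counts
    return agg_nodes, agg_edges

print(graph_cuboid(("gender",)))
# ({('M',): 1, ('F',): 2}, {(('F',), ('M',)): 2, (('F',), ('F',)): 1})
```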

179 citations


Proceedings ArticleDOI
13 Jun 2011
TL;DR: This paper defines a new, complex, mixed-workload benchmark, called CH-benCHmark, which bridges the gap between the established single-workload suites of TPC-C for OLTP and TPC-H for OLAP by executing a complex mixed workload.
Abstract: While standardized and widely used benchmarks address either operational or real-time Business Intelligence (BI) workloads, the lack of a hybrid benchmark led us to the definition of a new, complex, mixed workload benchmark, called mixed workload CH-benCHmark. This benchmark bridges the gap between the established single-workload suites of TPC-C for OLTP and TPC-H for OLAP, and executes a complex mixed workload: a transactional workload based on the order entry processing of TPC-C and a corresponding TPC-H-equivalent OLAP query suite run in parallel on the same tables in a single database system. As it is derived from these two most widely used TPC benchmarks, the CH-benCHmark produces results highly relevant to both hybrid and classic single-workload systems.

133 citations


Journal ArticleDOI
TL;DR: A computation- and storage-efficient algorithm for estimating equation (EE) estimation in massive data sets using a “divide-and-conquer” strategy; the resulting aggregated EE estimator is strongly consistent and asymptotically equivalent to the EE estimator.
Abstract: Motivated by the recent active research on online analytical processing (OLAP), we develop a computation and storage efficient algorithm for estimating equation (EE) estimation in massive data sets using a “divide-and-conquer” strategy. In each partition of the data set, we compress the raw data into some low dimensional statistics and then discard the raw data. Then, we obtain an approximation to the EE estimator, the aggregated EE (AEE) estimator, by solving an equation aggregated from the saved low dimensional statistics in all partitions. Such low dimensional statistics are taken as the EE estimates and first-order derivatives of the estimating equations in each partition. We show that, under proper partitioning and some regularity conditions, the AEE estimator is strongly consistent and asymptotically equivalent to the EE estimator. A major application of the AEE technique is to support fast OLAP of EE estimations for data warehousing technologies such as data cubes and data streams. It can also be used to reduce the computation time and conquer the memory constraint problem posed by massive data sets. Simulation studies show that the AEE estimator provides efficient storage and a remarkable reduction in computation time, especially in its applications to data cubes and data streams.
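
The aggregation step can be written out concretely. This is a hedged reconstruction from the abstract's own description (each partition k retains only its EE estimate and the first-order derivative of its estimating equation M_k; the notation is ours, not the paper's):

```latex
% Each partition k keeps only its EE estimate \hat\theta_k and the
% first-order derivative A_k of its estimating equation M_k at \hat\theta_k.
% Linearizing M_k(\theta) \approx -A_k(\theta - \hat\theta_k) and solving
% \sum_k M_k(\theta) = 0 yields the aggregated (AEE) estimator:
\[
  \hat{\theta}_{\mathrm{AEE}}
  \;=\;
  \Bigl(\sum_{k=1}^{K} A_k\Bigr)^{-1} \sum_{k=1}^{K} A_k\,\hat{\theta}_k ,
  \qquad
  A_k \;=\; -\,\frac{\partial M_k(\theta)}{\partial \theta}\Big|_{\theta=\hat{\theta}_k}.
\]
```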

129 citations


Journal ArticleDOI
TL;DR: This work proposes two categories of novel anonymization methods: one based on approximate nearest-neighbor (NN) search in high-dimensional spaces, performed efficiently through locality-sensitive hashing (LSH), and one based on two data transformations that capture the correlation in the underlying data: reduction to a band matrix and Gray encoding-based sorting.
Abstract: Existing research on privacy-preserving data publishing focuses on relational data: in this context, the objective is to enforce privacy-preserving paradigms, such as k-anonymity and l-diversity, while minimizing the information loss incurred in the anonymizing process (i.e., maximize data utility). Existing techniques work well for fixed-schema data, with low dimensionality. Nevertheless, certain applications require privacy-preserving publishing of transactional data (or basket data), which involve hundreds or even thousands of dimensions, rendering existing methods unusable. We propose two categories of novel anonymization methods for sparse high-dimensional data. The first category is based on approximate nearest-neighbor (NN) search in high-dimensional spaces, which is efficiently performed through locality-sensitive hashing (LSH). In the second category, we propose two data transformations that capture the correlation in the underlying data: 1) reduction to a band matrix and 2) Gray encoding-based sorting. These representations facilitate the formation of anonymized groups with low information loss, through an efficient linear-time heuristic. We show experimentally, using real-life data sets, that all our methods clearly outperform existing state of the art. Among the proposed techniques, NN-search yields superior data utility compared to the band matrix transformation, but incurs higher computational overhead. The data transformation based on Gray code sorting performs best in terms of both data utility and execution time.
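
The Gray-encoding-based sorting idea lends itself to a compact sketch: encode each transaction as a bitmap over the item universe, order transactions by their bitmap's rank in the binary-reflected Gray-code sequence (so neighbors differ in few items), and cut the ordered list into k-sized anonymization groups. Item names and k are illustrative; the paper's information-loss heuristic is not shown.

```python
# Gray-code sorting for transactional data: similar item sets end up
# adjacent, so consecutive k-sized groups lose little information.
ITEMS = ["bread", "milk", "beer", "nappies"]  # illustrative universe

def bitmap(txn):
    return sum(1 << i for i, it in enumerate(ITEMS) if it in txn)

def gray_rank(g):
    """Inverse of b -> b ^ (b >> 1): position of bitmap g in Gray order."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

txns = [{"bread", "milk"}, {"beer"}, {"bread"}, {"milk", "beer"}]
ordered = sorted(txns, key=lambda t: gray_rank(bitmap(t)))
k = 2
groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
print(groups)
```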

97 citations


Proceedings ArticleDOI
11 Apr 2011
TL;DR: This paper presents ES2, the elastic data storage system of epiC, which is designed to support both OLTP and OLAP functionalities within the same storage, together with experimental results which demonstrate the efficiency of the system.
Abstract: Cloud computing represents a paradigm shift driven by the increasing demand of Web-based applications for elastic, scalable and efficient system architectures that can efficiently support their ever-growing data volume and large-scale data analysis. A typical data management system has to deal with real-time updates by individual users, as well as periodic large-scale analytical processing, indexing, and data extraction. While such operations may take place in the same domain, the design and development of the systems have somehow evolved independently for transactional and periodical analytical processing. Such a system-level separation has resulted in problems such as data freshness as well as serious data storage redundancy. Ideally, it would be more efficient to apply ad-hoc analytical processing on the same data directly. However, to the best of our knowledge, such an approach has not been adopted in real implementation. Intrigued by such an observation, we have designed and implemented epiC, an elastic power-aware data-intensive Cloud platform for supporting both data intensive analytical operations (ref. as OLAP) and online transactions (ref. as OLTP). In this paper, we present ES2 - the elastic data storage system of epiC, which is designed to support both functionalities within the same storage. We present the system architecture and the functions of each system component, and experimental results which demonstrate the efficiency of the system.

96 citations


Proceedings ArticleDOI
12 Jun 2011
TL;DR: This work proposes a novel E-Cube model which combines CEP and OLAP techniques for efficient multi-dimensional event pattern analysis at different abstraction levels, and designs a cost-driven adaptive optimizer called Chase that exploits reuse strategies for optimal E-Cube hierarchy execution.
Abstract: Many modern applications, including online financial feeds, tag-based mass transit systems and RFID-based supply chain management systems transmit real-time data streams. There is a need for event stream processing technology to analyze this vast amount of sequential data to enable online operational decision making. Existing techniques such as traditional online analytical processing (OLAP) systems are not designed for real-time pattern-based operations, while state-of-the-art Complex Event Processing (CEP) systems designed for sequence detection do not support OLAP operations. We propose a novel E-Cube model which combines CEP and OLAP techniques for efficient multi-dimensional event pattern analysis at different abstraction levels. Our analysis of the interrelationships in both concept abstraction and pattern refinement among queries facilitates the composition of these queries into an integrated E-Cube hierarchy. Based on this E-Cube hierarchy, strategies of drill-down (refinement from abstract to more specific patterns) and of roll-up (generalization from specific to more abstract patterns) are developed for efficient workload evaluation. Our proposed execution strategies reuse intermediate results along both the concept and the pattern refinement relationships between queries. On this foundation, we design a cost-driven adaptive optimizer called Chase that exploits the above reuse strategies for optimal E-Cube hierarchy execution. Our experimental studies comparing alternate strategies on a real-world financial data stream under different workload conditions demonstrate the superiority of the Chase method. In particular, our Chase execution in many cases performs tenfold faster than the state-of-the-art strategy for real stock market query workloads.

78 citations


Proceedings ArticleDOI
07 Sep 2011
TL;DR: An extract-transform-load (ETL) pipeline is used to convert statistical Linked Data into a format suitable for loading into an open-source OLAP system, demonstrating how standard OLAP infrastructure can be used for elaborate querying and visualisation of integrated statistical Linked Data.
Abstract: The amount of available Linked Data on the Web is increasing, and data providers start to publish statistical datasets that comprise numerical data. Such statistical datasets differ significantly from the currently predominant network-style data published on the Web. We explore the possibility of integrating statistical data from multiple Linked Data sources. We provide a mapping from statistical Linked Data into the Multidimensional Model used in data warehouses. We use an extract-transform-load (ETL) pipeline to convert statistical Linked Data into a format suitable for loading into an open-source OLAP system, and thus demonstrate how standard OLAP infrastructure can be used for elaborate querying and visualisation of integrated statistical Linked Data. We discuss lessons learned from three experiments and identify areas which require future work to ultimately arrive at a well-interlinked set of statistical data from multiple sources which is processable with standard OLAP systems.
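
One step of such a pipeline can be sketched: extracting qb:Observation instances from an RDF Data Cube graph and flattening them into fact-table rows that an OLAP engine can load. The dimension and measure property URIs below are placeholders, not the vocabulary any particular dataset uses, and the local file name is hypothetical.

```python
# Flatten RDF Data Cube (qb:) observations into fact-table rows.
from rdflib import Graph

g = Graph()
g.parse("dataset.ttl", format="turtle")  # hypothetical local copy

rows = g.query("""
    PREFIX qb: <http://purl.org/linked-data/cube#>
    PREFIX ex: <http://example.org/>
    SELECT ?area ?year ?value WHERE {
        ?obs a qb:Observation ;
             ex:refArea   ?area ;   # dimension (placeholder URI)
             ex:refPeriod ?year ;   # dimension (placeholder URI)
             ex:measure   ?value .  # measure   (placeholder URI)
    }
""")
fact_table = [(str(a), str(y), float(v)) for a, y, v in rows]
print(fact_table[:3])
```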

75 citations


Journal ArticleDOI
TL;DR: This paper presents myOLAP, an approach for expressing and evaluating OLAP preferences, devised by taking into account the three peculiarities of the OLAP domain, and proposes an algorithm called WeSt that relies on a novel graph representation where two types of domination between sets of facts may be expressed.
Abstract: Multidimensional databases are the core of business intelligence systems. Their users express complex OLAP queries, often returning large volumes of facts that sometimes provide little or no information. Thus, expressing preferences could be highly valuable in this domain. The OLAP domain is representative of an unexplored class of preference queries, characterized by three peculiarities: preferences can be expressed on both numerical and categorical domains; they can also be expressed on the aggregation level of facts; and the space on which preferences are expressed includes both elemental and aggregated facts. In this paper, we present myOLAP, an approach for expressing and evaluating OLAP preferences, devised by taking into account the three peculiarities above. We first propose a preference algebra where users are enabled to express their preferences not only on attributes and measures but also on the aggregation level of facts, for instance by stating that monthly data are preferred to yearly and daily data. Then, with respect to preference evaluation, we propose an algorithm called WeSt that relies on a novel graph representation where two types of domination between sets of facts may be expressed, which considerably improves efficiency. The approach is extensively tested for efficiency and effectiveness on real data, and compared against two other approaches in the literature.

73 citations


Proceedings Article
01 Jan 2011
TL;DR: This work proposes a new type of database system coined OctopusDB, which uses a logical event log as its primary storage structure and introduces the concept of Storage Views (SV), i.e. secondary, alternative physical data representations covering all or subsets of the primary log.
Abstract: We propose a new type of database system coined OctopusDB. Our approach suggests a unified, one-size-fits-all data processing architecture for OLTP, OLAP, streaming systems, and scan-oriented database systems. OctopusDB radically departs from existing architectures in the following way: it uses a logical event log as its primary storage structure. To make this approach efficient we introduce the concept of Storage Views (SV), i.e., secondary, alternative physical data representations covering all or subsets of the primary log. OctopusDB (1) allows us to use different types of SVs for different subsets of the data; and (2) eliminates the need to use different types of database systems for different applications. Thus, based on the workload, OctopusDB emulates different types of systems (row stores, column stores, streaming systems, and more importantly, any hybrid combination of these). This is a feature impossible to achieve with traditional DBMSs.
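
The log-plus-Storage-Views idea can be illustrated in a few lines: a single ordered event log serves as primary storage, with a row-oriented and a column-oriented SV derived from it on demand. This toy sketch omits updates, deletes, and incremental SV maintenance.

```python
# A logical event log as primary storage, with Storage Views (SVs)
# materialized from it on demand.
log = []  # primary storage: ordered log of insert events

def append(record: dict):
    log.append(("INSERT", record))

def row_view():
    """SV #1: row-store representation replayed from the log."""
    return [rec for op, rec in log if op == "INSERT"]

def column_view(cols):
    """SV #2: column-store representation over a subset of columns."""
    return {c: [rec[c] for op, rec in log if op == "INSERT"] for c in cols}

append({"id": 1, "price": 9.5})
append({"id": 2, "price": 3.0})
print(row_view())              # OLTP-style access
print(column_view(["price"]))  # OLAP-style scan
```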

Patent
12 May 2011
TL;DR: A testing framework is presented that automates the querying, extraction and loading of test data into a test result database from a plurality of data sources and application interfaces using source-specific adaptors.
Abstract: The present method and apparatus provide for automated testing of data integration and business intelligence projects using an Extract, Load and Validate (ELV) architecture. The method and computer program product provide a testing framework that automates the querying, extraction and loading of test data into a test result database from a plurality of data sources and application interfaces using source-specific adaptors. The test data available for extraction using the adaptors includes metadata, such as the database queries generated by OLAP tools, that is critical for validating changes in business intelligence systems. A validation module helps define validation rules for verifying the test data loaded into the test result database. The validation module further provides a framework for comparing the test data with previously archived test data as well as benchmark test data.

Patent
11 Jul 2011
TL;DR: In this paper, the authors present a system that simplifies the creation of multidimensional OLAP models from one or more semantically enabled data sources, including web-enabled OLAP interfaces.
Abstract: This system comprises methods that simplify the creation of multidimensional OLAP models from one or more semantically enabled data sources. The system also comprises methods enabling interoperability between existing OLAP end-user interfaces, the system's representation of OLAP and the underlying data sources. This includes web-enabled OLAP interfaces.

Patent
04 Apr 2011
TL;DR: A hybrid OLTP and OLAP database is maintained by using hardware-assisted replication mechanisms to keep consistent snapshots of the transactional data, where the updated data object is accessible for OLTP transactions while the non-updated data object remains accessible for OLAP queries.
Abstract: There is provided a method of maintaining a hybrid OLTP and OLAP database, the method comprising: executing one or more OLTP transactions; creating a virtual memory snapshot; and executing one or more OLAP queries using the virtual memory snapshot. Preferably, the method further comprises replicating a virtual memory page on which a data object is stored in response to an update to the data object, whereby the updated data object is accessible for OLTP transactions, while the non-updated data object remains accessible for OLAP queries. Accordingly, the present invention provides a hybrid system that can handle both OLTP and OLAP simultaneously by using hardware-assisted replication mechanisms to maintain consistent snapshots of the transactional data.

Patent
07 Dec 2011
TL;DR: A client application redirects each query either to an OLTP database server or to an OLAP database server, according to its mode of operation (e.g., read or update) and the synchronization status of the OLAP database server.
Abstract: A computer system provides access to both an online transaction processing (OLTP) database server and an online analytics processing (OLAP) database server. The computer system includes a client application adapted to receive a query. According to (a) mode of operation (e.g., read or update) of the client application and (b) synchronization status of the OLAP database server, the client application redirects the query to the OLTP database server or to the OLAP database server. The client application redirects the query to the OLTP database server when the mode of operation is other than a read-only operation or the synchronization status is "unsynchronized". The client application redirects the query to the OLAP database server when the mode of operation is a read-only operation and the synchronization status is "synchronized". The computer system further includes an OLTP application server (e.g., Enovia V6) comprising an OLTP adapter and an OLAP adapter. The OLAP adapter is formed of a mapping component adapted to map data between OLTP semantics and OLAP semantics.
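
The routing rule itself is simple enough to state as code. A minimal sketch, with placeholder server handles and the mode and status strings taken from the abstract:

```python
# Patent's routing rule: OLAP only when the query is read-only AND the
# OLAP replica is synchronized; everything else goes to OLTP.
def route(query, mode, olap_status):
    if mode == "read-only" and olap_status == "synchronized":
        return "OLAP", query
    return "OLTP", query

print(route("SELECT ...", "read-only", "synchronized"))    # -> OLAP
print(route("SELECT ...", "read-only", "unsynchronized"))  # -> OLTP
print(route("UPDATE ...", "update",    "synchronized"))    # -> OLTP
```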

Book ChapterDOI
22 Apr 2011
TL;DR: Two effective computational techniques, T-Distributiveness and T-Monotonicity, are proposed to achieve efficient query processing and cube materialization, and a T-OLAP query processing framework into which these techniques are weaved is provided.
Abstract: We propose a framework for efficient OLAP on information networks with a focus on the most interesting kind, the topological OLAP (called "T-OLAP"), which incurs topological changes in the underlying networks. T-OLAP operations generate new networks from the original ones by rolling up a subset of nodes chosen by certain constraint criteria. The key challenge is to efficiently compute measures for the newly generated networks and handle user queries with varied constraints. Two effective computational techniques, T-Distributiveness and T-Monotonicity, are proposed to achieve efficient query processing and cube materialization. We also provide a T-OLAP query processing framework into which these techniques are weaved. To the best of our knowledge, this is the first work to give a framework study for topological OLAP on information networks. Experimental results demonstrate both the effectiveness and efficiency of our proposed framework.

Journal ArticleDOI
TL;DR: A novel Secure Multiparty Computation (SMC)-based privacy-preserving OLAP framework for distributed collections of XML documents is proposed, which has many novel features, ranging from nice theoretical properties to an effective and efficient protocol, called the Secure Distributed OLAP aggregation protocol (SDO).

Proceedings ArticleDOI
29 Aug 2011
TL;DR: It is shown how NoSQL databases such as MongoDB and its key-value stores, thanks to the native MapReduce algorithm, can provide an efficient framework to aggregate large volumes of data.
Abstract: Data aggregation is one of the key features used in databases, especially for Business Intelligence (e.g., ETL, OLAP) and analytics/data mining. When considering SQL databases, aggregation is used to prepare and visualize data for deeper analyses. However, these operations are often impossible on very large volumes of data in terms of memory and time consumption. In this paper, we show how NoSQL databases such as MongoDB and its key-value stores, thanks to the native MapReduce algorithm, can provide an efficient framework to aggregate large volumes of data. We provide basic material about the MapReduce algorithm and the different NoSQL database types (read-intensive vs. write-intensive). We investigate how to efficiently model the data framework for BI and analytics. For this purpose, we focus on read-intensive NoSQL databases using MongoDB and we show how NoSQL and MapReduce can help handle large volumes of data.
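
The kind of aggregation the paper studies can be sketched with MongoDB's native MapReduce: a JavaScript map function emits (key, measure) pairs and a reduce function sums them. Collection and field names are invented; the map_reduce helper shown matches pymongo versions contemporary with the paper (it was removed in pymongo 4, where the aggregation pipeline replaces it).

```python
# OLAP-style aggregation (total sales per product) via MongoDB MapReduce.
from pymongo import MongoClient
from bson.code import Code

db = MongoClient()["shop"]  # hypothetical database

mapper = Code("function () { emit(this.product, this.amount); }")
reducer = Code("function (key, values) { return Array.sum(values); }")

result = db.sales.map_reduce(mapper, reducer, "sales_by_product")
for doc in result.find():
    print(doc["_id"], doc["value"])
```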

Book ChapterDOI
20 Jul 2011
TL;DR: A novel framework for estimating OLAP queries over uncertain and imprecise multidimensional data streams is introduced, along with a probabilistic data stream model that exploits "natural" features of OLAP data, such as the presence of clusters and high correlations.
Abstract: In this paper, we introduce a novel framework for estimating OLAP queries over uncertain and imprecise multidimensional data streams, along with three relevant research contributions: (i) a probabilistic data stream model, which describes both precise and imprecise multidimensional data stream readings in terms of nice confidence-interval-based Probability Distribution Functions (PDF); (ii) a possible-world semantics for uncertain and imprecise multidimensional data streams, which is based on an innovative data-driven approach that exploits "natural" features of OLAP data, such as the presence of clusters and high correlations; (iii) an innovative approach for providing theoretically-founded estimates to OLAP queries over uncertain and imprecise multidimensional data streams that exploits the well-recognized probabilistic estimators theory.

Posted Content
TL;DR: This paper discusses the existing approaches and tools working in main memory and/or with web interfaces (including freeware tools) that are relevant to decision making in small and middle-sized enterprises.
Abstract: Data warehouses are the core of decision support systems, which nowadays are used by all kinds of enterprises across the entire world. Although many studies have been conducted on the need of decision support systems (DSSs) for small businesses, most of them adopt existing solutions and approaches, which are appropriate for large-scale enterprises but inadequate for small and middle-sized enterprises. Small enterprises require cheap, lightweight architectures and tools (hardware and software) providing online data analysis. In order to ensure these features, we review web-based business intelligence approaches. For real-time analysis, the traditional OLAP architecture is cumbersome and storage-costly; therefore, we also review in-memory processing. Consequently, this paper discusses the existing approaches and tools working in main memory and/or with web interfaces (including freeware tools), relevant for small and middle-sized enterprises in decision making.

Book ChapterDOI
20 Sep 2011
TL;DR: A proactive approach is proposed that couples an MDX-based language for expressing OLAP preferences with a mining technique for automatically deriving preferences; experiments prove the effectiveness and efficiency of the approach.
Abstract: The goal of personalization is to deliver information that is relevant to an individual or a group of individuals in the most appropriate format and layout. In the OLAP context personalization is quite beneficial, because queries can be very complex and they may return huge amounts of data. Aimed at making the user's experience with OLAP as plain as possible, in this paper we propose a proactive approach that couples an MDX-based language for expressing OLAP preferences to a mining technique for automatically deriving preferences. First, the log of past MDX queries issued by that user is mined to extract a set of association rules that relate sets of frequent query fragments; then, given a specific query, a subset of pertinent and effective rules is selected; finally, the selected rules are translated into a preference that is used to annotate the user's query. A set of experimental results proves the effectiveness and efficiency of our approach.
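
The mining step can be sketched: treat each logged MDX query as a set of fragments, count co-occurring pairs, and keep rules whose confidence clears a threshold. Fragment names and thresholds are illustrative, and a real implementation would use a proper frequent-itemset miner rather than this pair-counting shortcut.

```python
# Toy association-rule mining over an MDX query log: each query is a
# set of fragments; rules relate frequently co-occurring fragments.
from itertools import combinations
from collections import Counter

log = [{"Year", "Sales"}, {"Year", "Sales", "Region"},
       {"Year", "Region"}, {"Sales", "Region"}]

pair_support = Counter()
for query in log:
    for a, b in combinations(sorted(query), 2):
        pair_support[(a, b)] += 1

single = Counter(f for q in log for f in q)
rules = [(a, b, pair_support[(a, b)] / single[a])
         for (a, b) in pair_support
         if pair_support[(a, b)] / single[a] >= 0.5]
print(rules)  # (antecedent, consequent, confidence)
```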

Journal ArticleDOI
TL;DR: A framework for a recommender system for OLAP users that leverages former users' investigations to enhance discovery-driven analysis and is implemented in a system that uses the open source Mondrian server and recommends MDX queries.
Abstract: Recommending database queries is an emerging and promising field of research and is of particular interest in the domain of OLAP systems, where the user is left with the tedious process of navigating large datacubes. In this paper, the authors present a framework for a recommender system for OLAP users that leverages former users' investigations to enhance discovery-driven analysis. This framework recommends the discoveries detected in former sessions that investigated the same unexpected data as the current session. This task is accomplished by (1) analysing the query log to discover pairs of cells at various levels of detail for which the measure values differ significantly, and (2) analysing a current query to detect if a particular pair of cells for which the measure values differ significantly can be related to what is discovered in the log. This framework is implemented in a system that uses the open source Mondrian server and recommends MDX queries. Preliminary experiments were conducted to assess the quality of the recommendations in terms of precision and recall, as well as the efficiency of their on-line computation.

Patent
Xue C. Li, Xiao J. Fu, Xue F. Gao, Xin Xin
23 Feb 2011
TL;DR: In this paper, a method and system for validating data is presented, where a data cube is generated by transforming the warehouse data via an OLAP transformation model, and a reference dataset (S) is generated from the source data.
Abstract: A method and system for validating data. Warehouse data is generated by transforming source data via an ETL transformation model. A data cube is generated by transforming the warehouse data via an OLAP transformation model. A report dataset (MDS1) is generated from the data cube. A reference dataset (S) is generated from the source data. Whether MDS1 matches S is determined. If MDS1 doesn't match S, then an OLAP inverse transformation is performed on MDS1 to generate an OLAP dataset (MDS2) and whether MDS2 matches S is determined. If MDS1 doesn't match S and MDS2 does not match S, then an ETL inverse transformation is performed on MDS2 to generate an ETL dataset (MDS3) and whether MDS2 matches MDS1 and whether MDS3 matches S is determined. If MDS1 doesn't match S and MDS2 does not match S and MDS3 does not match S, then whether MDS3 matches MDS2 is determined.
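
The validation cascade reads naturally as control flow. A simplified sketch, assuming placeholder inverse-transformation functions; the fault-localization labels are our interpretation of the patent's comparison chain, not wording from the patent itself.

```python
def validate(mds1, s, olap_inverse, etl_inverse):
    """Walk the comparison chain MDS1 -> MDS2 -> MDS3 against source S."""
    if mds1 == s:
        return "report dataset matches source reference"
    mds2 = olap_inverse(mds1)          # undo the OLAP transformation
    if mds2 == s:
        return "mismatch likely introduced by the OLAP transformation"
    mds3 = etl_inverse(mds2)           # undo the ETL transformation
    if mds3 == s:
        return "mismatch likely introduced by the ETL transformation"
    # Final check from the claim: does MDS3 at least match MDS2?
    return ("inconclusive; MDS3 %s MDS2"
            % ("matches" if mds3 == mds2 else "does not match"))
```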

Proceedings ArticleDOI
28 Oct 2011
TL;DR: This paper explores the possibility of having data in a cloud by using BigTable to store the corporate historical data and MapReduce as an agile mechanism to deploy cubes in ad-hoc Data Marts, and compares three different approaches to retrieve data cubes from BigTable by means of MapReduce.
Abstract: In recent years, the problems of using generic storage techniques for very specific applications have been detected and outlined. Thus, some alternatives to relational DBMSs (e.g., BigTable) are blooming. On the other hand, cloud computing is already a reality that helps to save money by eliminating fixed hardware and software costs in favor of paying per use. Indeed, specific software tools to exploit a cloud are also here. The trend in this case is toward using tools based on the MapReduce paradigm developed by Google. In this paper, we explore the possibility of having data in a cloud by using BigTable to store the corporate historical data and MapReduce as an agile mechanism to deploy cubes in ad-hoc Data Marts. Our main contribution is the comparison of three different approaches to retrieve data cubes from BigTable by means of MapReduce and the definition of criteria to choose among them.
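
The cube-from-BigTable idea reduces to a familiar MapReduce pattern: map each stored row to (dimension values, measure) and sum in the reduce. A minimal local sketch with an invented row layout; the paper's three approaches differ in how this job is organized over BigTable, which is not shown.

```python
# Local stand-in for a MapReduce job that builds one cuboid.
from collections import defaultdict

def map_phase(rows, dims, measure):
    for row in rows:
        yield tuple(row[d] for d in dims), row[measure]

def reduce_phase(pairs):
    acc = defaultdict(float)
    for key, value in pairs:
        acc[key] += value
    return dict(acc)

rows = [{"shop": "NY", "year": 2011, "sales": 10.0},
        {"shop": "NY", "year": 2011, "sales": 5.0},
        {"shop": "LA", "year": 2011, "sales": 7.0}]
cuboid = reduce_phase(map_phase(rows, ("shop",), "sales"))
print(cuboid)  # {('NY',): 15.0, ('LA',): 7.0}
```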

Patent
14 Mar 2011
TL;DR: In this article, the authors present a computer implemented method of relating data and generating reports, which includes storing, by an OLAP system, a network data structure that relates a plurality of data objects.
Abstract: In one embodiment the present invention includes a computer implemented method of relating data and generating reports. The method includes storing, by an OLAP system, a network data structure that relates a plurality of data objects. The method further includes storing transactional data in an in-memory database in the OLAP system. The method further includes generating, by the OLAP system, a report using the stored transactional data according to the network data structure. In this manner, deficiencies of the traditional star schema paradigm of data warehousing may be avoided.

Patent
06 Apr 2011
TL;DR: A database access model and storage structure that efficiently support concurrent OLTP and OLAP activity independently of the data model or schema used are described; they avoid the need to design schemas for particular workloads or query patterns.
Abstract: A database access model and storage structure that efficiently support concurrent OLTP and OLAP activity independently of the data model or schema used are described. The storage structure and access model presented avoid the need to design schemas for particular workloads or query patterns and avoid the need to design or implement indexing to support specific queries. Indeed, the access model presented is independent of the database model used and can equally support relational, object and hierarchical models, amongst others.

Patent
18 Jan 2011
TL;DR: Systems and methods are provided for graphically distinguishing levels from a multidimensional database, such as by associating two or more of the database's levels with a plurality of different visual indicators.
Abstract: In accordance with the teachings described herein, systems and methods are provided for graphically distinguishing levels from a multidimensional database. Levels from a multidimensional database are distinguished, such as by associating two or more of the database's levels with a plurality of different visual indicators.

Journal ArticleDOI
TL;DR: A methodology is proposed for providing linguistic answers to queries involving the comparison of time series obtained from data cubes with a time dimension, based on linguistically quantified statements and pointwise definitions of the degree and sign of local change.
Abstract: In this paper, we propose a methodology for providing linguistic answers to queries involving the comparison of time series obtained from data cubes with a time dimension. Time series related to events which are interesting for the user are obtained by querying data cubes using OnLine Analytical Processing (OLAP) operations on the time dimension. The comparison of these query results can be summarized so that an appropriate short linguistic description of the series is provided to the user. Our approach is based on linguistically quantified statements and pointwise definitions of the degree and sign of local change. Our linguistic summaries are well suited to be included in an interface layer of a data warehouse system, improving the quality of human-machine interaction and the understandability of the results.
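
The core of such a summary can be sketched: compute the sign of local change between consecutive points of two series obtained from OLAP queries, then evaluate a linguistically quantified statement via Zadeh's calculus, truth = Q(mean membership). The piecewise-linear quantifier "most" below is invented for illustration; the paper's degree-of-change machinery is richer.

```python
# Truth degree of "most of the time, series A grows faster than B".
def most(p):  # fuzzy quantifier: 0 below 0.3, 1 above 0.8, linear between
    return min(1.0, max(0.0, (p - 0.3) / 0.5))

def local_changes(series):
    return [b - a for a, b in zip(series, series[1:])]

a = [10, 12, 15, 15, 19]
b = [10, 11, 12, 14, 15]
faster = [1.0 if da > db else 0.0
          for da, db in zip(local_changes(a), local_changes(b))]
print("truth of 'A mostly grows faster than B':",
      most(sum(faster) / len(faster)))   # -> 0.9
```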

Journal ArticleDOI
TL;DR: This paper claims that a spreadsheet-like query model, where formulation is done in a column-wise fashion, can intuitively express a large class of useful and practical RFDM queries; it proposes a simple SQL extension to that end and shows how these queries can be evaluated efficiently.

Proceedings ArticleDOI
13 Jun 2011
TL;DR: An in-memory database system that separates transaction processing from OLAP query processing via periodically refreshed snapshots is designed, so that OLAP queries can be executed without any synchronization and OLTP transaction processing follows the lock-free, mostly serial processing paradigm of H-Store.
Abstract: The quest for real-time business intelligence requires executing mixed transaction and query processing workloads on the same current database state. However, as Harizopoulos et al. [6] showed for transactional processing, co-execution using classical concurrency control techniques will not yield the necessary performance -- even in re-emerging main memory database systems. Therefore, we designed an in-memory database system that separates transaction processing from OLAP query processing via periodically refreshed snapshots. Thus, OLAP queries can be executed without any synchronization and OLTP transaction processing follows the lock-free, mostly serial processing paradigm of H-Store [8]. In this paper, we analyze different snapshot mechanisms: Hardware-supported Page Shadowing, which lazily copies memory pages when changed by transactions, software controlled Tuple Shadowing, which generates a new version when a tuple is modified, software controlled Twin Tuple, which constantly maintains two versions of each tuple and HotCold Shadowing, which effectively combines Tuple Shadowing and hardware-supported Page Shadowing by clustering update-intensive objects. We evaluate their performance based on the mixed workload CH-BenCHmark which combines the TPC-C and the TPC-H benchmarks on the same database schema and state.