Proceedings ArticleDOI

Optimal Space and Time Complexity Analysis on the Lattice of Cuboids Using Galois Connections for Data Warehousing

24 Nov 2009-pp 1271-1275
TL;DR: An optimal aggregation and counter-aggregation (drill-down) methodology is proposed for the multidimensional data cube: aggregation is performed on smaller cuboids after partitioning them according to the cardinality of the individual dimensions.
Abstract: In this paper, an optimal aggregation and counter-aggregation (drill-down) methodology is proposed for the multidimensional data cube. The main idea is to aggregate on smaller cuboids after partitioning them according to the cardinality of the individual dimensions. Based on the operations that make these partitions, a Galois Connection is identified for formal analysis, which allows us to guarantee the soundness of the storage-space and time-complexity optimizations for the abstraction and concretization functions defined on the lattice structure. Our contribution can be seen as an application of the Abstract Interpretation framework to OLAP operations on the multidimensional data model.
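To make the lattice-and-Galois-connection idea concrete, here is a minimal sketch (hypothetical dimension names and partition; not the paper's implementation): cuboids are modelled as sets of dimensions ordered by inclusion, and a pair of monotone maps between the full lattice and its low-cardinality fragment is checked against the Galois condition.

```python
from itertools import combinations

# Hypothetical dimensions; the full powerset of DIMS is the lattice of cuboids,
# ordered by set inclusion.
DIMS = frozenset({"time", "item", "location", "supplier"})

def cuboids(dims):
    """Enumerate every cuboid in the lattice (all subsets of dims)."""
    for k in range(len(dims) + 1):
        for combo in combinations(sorted(dims), k):
            yield frozenset(combo)

# Assumed partition by cardinality: KEPT holds the low-cardinality dimensions.
KEPT = frozenset({"time", "item"})

def alpha(c):
    """Abstraction: roll up by discarding dimensions outside KEPT."""
    return c & KEPT

def gamma(a):
    """Concretization: the most detailed cuboid that abstracts to `a`."""
    return a | (DIMS - KEPT)

# Galois condition: alpha(c) <= a  iff  c <= gamma(a), for all cuboids.
assert all((alpha(c) <= a) == (c <= gamma(a))
           for c in cuboids(DIMS) for a in cuboids(KEPT))
print("Galois connection verified on", 2 ** len(DIMS), "cuboids")
```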
Citations
Journal ArticleDOI
TL;DR: This research work dynamically finds the most cost-effective path through the lattice of cuboids, based on a concept hierarchy, to minimize query access time.

13 citations

Proceedings ArticleDOI
19 Feb 2011
TL;DR: A new algorithm is proposed that uses a dynamic data structure which shrinks over time, resulting in better space utilization and reduced computation time, and the approach offers a formal analysis of concept hierarchies in an abstract interpretation framework.
Abstract: This paper proposes a new methodology for the efficient implementation of OLAP operations using concept hierarchies of attributes in a data warehouse. The different granularities associated with a particular dimension, and the hierarchy among them, may be represented as a lattice. The focus is to move up (roll-up) and down (drill-down) within the lattice structure using an algorithm with optimal time complexity. In this paper, a new algorithm is proposed that uses a dynamic data structure which shrinks over time, resulting in better space utilization and reduced computation time. A Galois Connection is identified on this lattice structure, with well-defined abstraction and concretization functions based on the concept hierarchy. The contribution offers a formal analysis using concept hierarchies in an abstract interpretation framework.
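As an illustration of moving up such a lattice, the sketch below uses an assumed concept hierarchy (day → month → quarter → year) and toy data; it is not the authors' algorithm. Each roll-up step re-aggregates and replaces the finer table, so the structure shrinks over time as the abstract describes.

```python
from collections import defaultdict

HIERARCHY = ["day", "month", "quarter", "year"]  # finest to coarsest (assumed)

def parent_of(level, value):
    """Map a key at `level` to its parent key at the next-coarser level."""
    if level == "day":
        y, m, d = value
        return (y, m)                    # day -> month
    if level == "month":
        y, m = value
        return (y, (m - 1) // 3 + 1)     # month -> quarter
    if level == "quarter":
        y, q = value
        return (y,)                      # quarter -> year
    raise ValueError("already at the coarsest level")

def roll_up(table, level):
    """One roll-up step: re-aggregate and *replace* the finer table, so the
    structure shrinks as we climb; drill-down would re-read a finer cuboid."""
    coarser = defaultdict(int)
    for key, measure in table.items():
        coarser[parent_of(level, key)] += measure
    return dict(coarser), HIERARCHY[HIERARCHY.index(level) + 1]

# Toy facts at day granularity: (year, month, day) -> sales.
sales = {(2009, 1, 5): 10, (2009, 1, 9): 7, (2009, 4, 2): 3}
table, level = sales, "day"
while level != "year":
    table, level = roll_up(table, level)
print(level, table)   # year granularity: {(2009,): 20}
```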

9 citations


Additional excerpts

  • ...In contrast to the proposed work in [1], a particular dimension has been considered for which the “concept hierarchy” prevails....


Book ChapterDOI
TL;DR: A new methodology for efficient implementation forms a lattice on the query parameters, which helps to correlate the different query parameters and in turn forms association rules among them.
Abstract: This research work is on optimizing the number of query parameters required to recommend an e-learning platform. This paper proposes a new methodology for efficient implementation by forming a lattice on the query parameters. This lattice structure helps to correlate the different query parameters, which in turn form association rules among them. The proposed methodology is conceptualized on an e-learning platform, with the objective of formulating an effective recommendation system that determines associations between the various products offered by the platform by analyzing a minimal set of query parameters.
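The sketch below illustrates the idea under assumed names (a toy query log and a made-up confidence threshold; not the chapter's method): subsets of query parameters, ordered by inclusion, form the lattice, and support counts over it yield candidate association rules.

```python
from itertools import permutations

# Hypothetical log: each entry is the set of parameters a user queried with.
query_log = [
    {"topic", "level", "duration"},
    {"topic", "level"},
    {"topic", "duration"},
    {"topic", "level", "price"},
]

def support(paramset):
    """Fraction of logged queries containing every parameter in the set."""
    return sum(paramset <= q for q in query_log) / len(query_log)

# Association rules x -> y between single parameters, filtered by confidence.
params = set().union(*query_log)
for x, y in permutations(sorted(params), 2):
    s_xy, s_x = support({x, y}), support({x})
    if s_x and s_xy / s_x >= 0.6:   # assumed confidence threshold
        print(f"{x} -> {y}  (confidence {s_xy / s_x:.2f}, support {s_xy:.2f})")
```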

8 citations

Book ChapterDOI
01 Jan 2019
TL;DR: In this paper, the authors propose a data warehouse model that integrates the existing parameters of loan-disbursement decisions and also incorporates newly identified concepts to give priority to customers who do not have any old credit history.
Abstract: Disbursement of loans is an important decision-making process for corporates such as banks and NBFCs (Non-Banking Financial Companies) that offer loans. The business involves several parameters, and the data associated with these parameters are generated from heterogeneous data sources and belong to different business verticals. Hence, decision-making in loan scenarios is critical, and the outcome involves resolving issues such as whether to grant the loan at all and, if sanctioned, the maximum amount. In this paper we consider the traditional parameters of the loan-sanction process, and along with these we identify a special case of the Indian credit-lending scenario in which people having old loans with a good repayment history get priority. This limits the business opportunities for banks, NBFCs, and other loan-disbursement organizations, as potentially good customers with no loan history are treated with lower priority. In this research work we propose a data warehouse model that integrates the existing parameters of loan-disbursement decisions and also incorporates the newly identified concepts to give priority to customers who do not have any old credit history.

5 citations

Book ChapterDOI
01 Jan 2021
TL;DR: This paper proposes an algorithm for cuboid materialization from a source cuboid to a target cuboid in an optimal way, such that the intermediate cuboids consume less space and take less time to generate: each selected cuboid has the least number of rows among the valid cuboids available at that step, found by sorting them on the product of the cardinalities of their dimensions.
Abstract: In the field of business intelligence, we require the analysis of multidimensional data, with the need for it to be fast and interactive. Data warehousing and OLAP approaches have been developed for this purpose, in which the data is viewed in the form of a multidimensional data cube that allows interactive analysis at various levels of abstraction, presented graphically. In a data cube, the need may arise to materialize a particular cuboid given that some other cuboid is presently materialized. In this paper, we propose an algorithm for cuboid materialization, starting from a source cuboid and proceeding to the target cuboid in an optimal way, such that the intermediate cuboids consume less space and require less time to generate. The algorithm ensures that these cuboids have the least number of rows compared to the other valid cuboids available for selection, by sorting them on the product of the cardinalities of the dimensions present in each cuboid.
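A minimal sketch of the greedy selection the abstract describes, with assumed dimension names and cardinalities: starting from the source cuboid, drop one dimension per step, always keeping the intermediate cuboid whose estimated row count (the product of its dimensions' cardinalities) is smallest.

```python
from math import prod

# Assumed cardinalities for a toy schema.
CARD = {"time": 365, "item": 1000, "location": 50, "supplier": 20}

def est_rows(cuboid):
    """Upper bound on a cuboid's size: product of its dimensions' cardinalities."""
    return prod(CARD[d] for d in cuboid)

def materialization_path(source, target):
    assert target <= source, "target must be derivable from source"
    path, current = [frozenset(source)], frozenset(source)
    while current != target:
        # Among dimensions still allowed to go, drop the one whose removal
        # minimizes the estimated size of the intermediate cuboid.
        candidates = [current - {d} for d in current - target]
        current = min(candidates, key=est_rows)
        path.append(current)
    return path

for step in materialization_path({"time", "item", "location", "supplier"},
                                 frozenset({"item"})):
    print(sorted(step), "~", est_rows(step), "rows")
```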

5 citations

References
Book
08 Sep 2000
TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Abstract: The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in data mining technology have made extensive data collection much easier, the field is still evolving, and there is a constant need for new techniques and tools to help us transform this data into useful information and knowledge. Since the previous edition's publication, great advances have been made in the field of data mining. Not only does the third edition of Data Mining: Concepts and Techniques continue the tradition of equipping you with an understanding and application of the theory and practice of discovering patterns hidden in large data sets, it also focuses on new, important topics in the field: data warehouses and data cube technology, mining streams, mining social networks, and mining spatial, multimedia, and other complex data. Each chapter is a stand-alone guide to a critical topic, presenting proven algorithms and sound implementations ready to be used directly or with strategic modification against live data. This is the resource you need if you want to apply today's most powerful data mining techniques to meet real business challenges. * Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects. * Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields. * Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

23,600 citations


"Optimal Space and Time Complexity A..." refers background in this paper

  • ...A partial order amongst the cuboids is observed in the sense that there exists cuboid Ci that may or may not be derived from some other cuboid Cj. The resulting Poset, in fact, would form a lattice of cuboids [7]....


Proceedings Article
01 Jul 1998
TL;DR: Two new algorithms for solving this problem, fundamentally different from the known algorithms, are presented; empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.
Abstract: We consider the problem of discovering association rules between items in a large database of sales transactions. We present two new algorithms for solving this problem that are fundamentally different from the known algorithms. Empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems. We also show how the best features of the two proposed algorithms can be combined into a hybrid algorithm, called AprioriHybrid. Scale-up experiments show that AprioriHybrid scales linearly with the number of transactions. AprioriHybrid also has excellent scale-up properties with respect to the transaction size and the number of items in the database.
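For reference, here is a compact, illustrative rendering of the level-wise idea underlying this paper (toy transactions and an assumed threshold; not the paper's optimized algorithms): a k-itemset is considered only if all of its (k-1)-subsets were frequent at the previous level.

```python
from itertools import combinations

# Toy transaction database and an assumed absolute support threshold.
transactions = [{"bread", "milk"}, {"bread", "butter"},
                {"bread", "milk", "butter"}, {"milk", "butter"}]
MIN_SUP = 2

def frequent_itemsets(transactions, min_sup):
    items = sorted(set().union(*transactions))
    level = [frozenset({i}) for i in items
             if sum(i in t for t in transactions) >= min_sup]
    while level:
        yield from level
        prev = set(level)
        # Candidate generation: join frequent k-itemsets differing by one item.
        candidates = {a | b for a in level for b in level
                      if len(a | b) == len(a) + 1}
        # Pruning via the Apriori property: all (k-1)-subsets must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev
                             for s in combinations(c, len(c) - 1))}
        level = [c for c in candidates
                 if sum(c <= t for t in transactions) >= min_sup]

for s in frequent_itemsets(transactions, MIN_SUP):
    print(sorted(s))
```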

10,863 citations

Proceedings ArticleDOI
01 Jan 1977
TL;DR: In this paper, the abstract interpretation of programs is used to describe computations in another universe of abstract objects, so that the results of abstract execution give some information on the actual computations.
Abstract: A program denotes computations in some universe of objects. Abstract interpretation of programs consists in using that denotation to describe computations in another universe of abstract objects, so that the results of abstract execution give some information on the actual computations. An intuitive example (which we borrow from Sintzoff [72]) is the rule of signs. The text -1515 * 17 may be understood to denote computations on the abstract universe {(+), (-), (±)} where the semantics of arithmetic operators is defined by the rule of signs. The abstract execution -1515 * 17 → -(+) * (+) → (-) * (+) → (-), proves that -1515 * 17 is a negative number. Abstract interpretation is concerned by a particular underlying structure of the usual universe of computations (the sign, in our example). It gives a summary of some facets of the actual executions of a program. In general this summary is simple to obtain but inaccurate (e.g. -1515 + 17 → -(+) + (+) → (-) + (+) → (±)). Despite its fundamentally incomplete results abstract interpretation allows the programmer or the compiler to answer questions which do not need full knowledge of program executions or which tolerate an imprecise answer, (e.g. partial correctness proofs of programs ignoring the termination problems, type checking, program optimizations which are not carried in the absence of certainty about their feasibility, …).
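The rule-of-signs example from this abstract is easy to replay as code; the sketch below (with 0 folded into "+" for brevity) reproduces both the precise multiplication and the deliberately imprecise addition.

```python
# Abstract values: "+", "-", and the top element "±" (sign unknown).
TOP = "±"

def abs_mul(a, b):
    """Rule of signs for multiplication: exact on definite signs."""
    if TOP in (a, b):
        return TOP
    return "+" if a == b else "-"

def abs_add(a, b):
    """Addition preserves a shared sign; mixed signs lose all information."""
    return a if a == b else TOP

def alpha(n):
    """Abstraction of a concrete integer to its sign (0 treated as '+')."""
    return "+" if n >= 0 else "-"

# -1515 * 17  ->  (-) * (+)  ->  (-)   : provably negative
print(abs_mul(alpha(-1515), alpha(17)))   # '-'
# -1515 + 17  ->  (-) + (+)  ->  (±)   : sound but imprecise
print(abs_add(alpha(-1515), alpha(17)))   # '±'
```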

6,829 citations


"Optimal Space and Time Complexity A..." refers background in this paper

  • ...In figure 1, there are four different attributes labeled as 1, 2, 3, and 4....


Proceedings Article
03 Sep 1996
TL;DR: In this article, the authors present fast algorithms for computing a collection of group-bys, which is equivalent to the union of a number of standard group-by operations, and show how the structure of CUBE computation can be viewed in terms of a hierarchy of group-by operations.
Abstract: At the heart of all OLAP or multidimensional data analysis applications is the ability to simultaneously aggregate across many sets of dimensions. Computing multidimensional aggregates is a performance bottleneck for these applications. This paper presents fast algorithms for computing a collection of group-bys. We focus on a special case of the aggregation problem: computation of the CUBE operator. The CUBE operator requires computing group-bys on all possible combinations of a list of attributes, and is equivalent to the union of a number of standard group-by operations. We show how the structure of CUBE computation can be viewed in terms of a hierarchy of group-by operations. Our algorithms extend sort-based and hash-based grouping methods with several optimizations, like combining common operations across multiple group-bys, caching, and using pre-computed group-bys for computing other group-bys. Empirical evaluation shows that the resulting algorithms give much better performance compared to straightforward methods.
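A toy sketch of this hierarchy-of-group-bys view (assumed data; none of the paper's sort- or hash-based optimizations): every subset of the dimension list gets a group-by, and each coarser group-by is derived from an already-computed finer parent instead of rescanning the base table.

```python
from collections import defaultdict

DIMS = ("year", "product", "region")
# Finest group-by (all dimensions), standing in for the base fact table.
base = {("2009", "laptop", "east"): 5, ("2009", "laptop", "west"): 3,
        ("2010", "phone", "east"): 4}

def coarsen(table, dims, drop):
    """Derive the group-by without `drop` from the group-by on `dims`."""
    out, pos = defaultdict(int), dims.index(drop)
    for key, measure in table.items():
        out[key[:pos] + key[pos + 1:]] += measure
    return dict(out)

# Walk the hierarchy top-down: each group-by on k dims yields its k children.
cube = {DIMS: base}
for k in range(len(DIMS), 0, -1):
    for dims in [d for d in cube if len(d) == k]:
        for drop in dims:
            child = tuple(d for d in dims if d != drop)
            if child not in cube:
                cube[child] = coarsen(cube[dims], dims, drop)

print(cube[("year",)])   # {('2009',): 8, ('2010',): 4}
print(cube[()])          # grand total: {(): 12}
```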

608 citations

Book ChapterDOI
12 Jul 2001
TL;DR: An ad-hoc grouping hierarchy based on the spatial index at the finest spatial granularity is constructed and incorporated into the lattice model, and efficient methods to process arbitrary aggregations are presented.
Abstract: Spatial databases store information about the position of individual objects in space. In many applications however, such as traffic supervision or mobile communications, only summarized data, like the number of cars in an area or phones serviced by a cell, is required. Although this information can be obtained from transactional spatial databases, its computation is expensive, rendering online processing inapplicable. Driven by the non-spatial paradigm, spatial data warehouses can be constructed to accelerate spatial OLAP operations. In this paper we consider the star-schema and we focus on the spatial dimensions. Unlike the non-spatial case, the groupings and the hierarchies can be numerous and unknown at design time, therefore the well-known materialization techniques are not directly applicable. In order to address this problem, we construct an ad-hoc grouping hierarchy based on the spatial index at the finest spatial granularity. We incorporate this hierarchy in the lattice model and present efficient methods to process arbitrary aggregations. We finally extend our technique to moving objects by employing incremental update methods.
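As a rough illustration of the finest-granularity grouping idea (an assumed regular grid standing in for the spatial index; not the paper's structure): counts are kept per cell at the finest level, and each coarser level of the ad-hoc hierarchy is produced by merging 2x2 blocks of cells, so aggregates never touch the raw points.

```python
from collections import defaultdict

# Toy object positions and an assumed finest-level cell width.
points = [(3, 1), (3, 2), (10, 9), (11, 9), (12, 14)]
CELL = 4

# Counts at the finest grid granularity.
finest = defaultdict(int)
for x, y in points:
    finest[(x // CELL, y // CELL)] += 1

def coarsen(level):
    """Merge 2x2 blocks of cells into one cell of the next-coarser level."""
    up = defaultdict(int)
    for (cx, cy), n in level.items():
        up[(cx // 2, cy // 2)] += n
    return up

coarser = coarsen(finest)
print(dict(finest))    # per-cell object counts at the finest granularity
print(dict(coarser))   # the same counts, one level up the hierarchy
```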

367 citations


"Optimal Space and Time Complexity A..." refers background in this paper

  • ...There has been work on the efficiency of OLAP operations on spatial data in a data warehouse [9]....


  • ...There have been a number of works on computational aspects of roll-up, drill-down and other OLAP operations [1, 2, 3, 6, 8, 9, 10, 11]....
