
Showing papers on "Online analytical processing" published in 1995


Journal ArticleDOI
TL;DR: Today's data-mining techniques range from OLAP (online analytical processing) tools that query multidimensional databases, to various statistical techniques, to advanced artificial intelligence techniques such as machine learning, neural networks, rule-based systems, and genetic algorithms.
Abstract: Increasingly, scientists and business people are mining mountains of stored data for valuable nuggets of insight, using computerized data-mining (also called knowledge discovery in databases, or KDD) techniques to learn more about customer behavior, discover new quasars, and catch crooks. Data mining recognizes patterns in data and predicts patterns from the data. Today's data-mining techniques, sometimes called 'siftware', range from OLAP (online analytical processing) tools that query multidimensional databases, to various statistical techniques, to advanced artificial intelligence techniques such as machine learning, neural networks, rule-based systems, and genetic algorithms. To mine gigabytes and terabytes of data in a timely fashion, a growing number of organizations are turning to parallelism to speed up the processing. The US Department of the Treasury, for example, has fielded a data-mining application that sifts through all large cash transactions reported by banks and casinos to detect potential money laundering. The application runs on a 6-processor Sun server.
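
To make the OLAP side of this concrete, the sketch below shows the kind of roll-up query such tools execute over a multidimensional dataset, approximated here in Python with pandas. The table, column names, and figures are illustrative assumptions, not data from the paper.

```python
# Minimal sketch of the kind of multidimensional query an OLAP tool runs:
# rolling a fact table up along two dimensions with pandas (hypothetical data).
import pandas as pd

# Hypothetical cash-transaction facts with two dimensions (region, quarter)
# and one additive measure (amount).
facts = pd.DataFrame({
    "region":  ["East", "East", "West", "West", "West"],
    "quarter": ["Q1", "Q2", "Q1", "Q1", "Q2"],
    "amount":  [120_000, 95_000, 210_000, 40_000, 88_000],
})

# Pivot: regions as rows, quarters as columns, summed amounts in the cells.
cube_slice = facts.pivot_table(
    index="region", columns="quarter", values="amount",
    aggfunc="sum", fill_value=0,
)
print(cube_slice)
```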

27 citations


Journal ArticleDOI
TL;DR: The present experiments varied list length to examine the hypothesis that the NOLAP advantage is produced by recall-like retrieval processes, and found that a reliable OLAP advantage was not obtained.
Abstract: Two experiments examined forced-choice associative recognition for OLAP and NOLAP test conditions. OLAP test trials consist of pairs with overlapping items (e.g., AB vs. AD), whereas NOLAP test trials contain no overlapping items (e.g., AB vs. CF). Previous results show better performance for NOLAP than for OLAP tests, contrary to the predictions of global memory models. The present experiments varied list length to examine the hypothesis that the NOLAP advantage is produced by recall-like retrieval processes. The use of longer lists either eliminated (Experiment 1) or greatly reduced (Experiment 2) the NOLAP advantage. However, a reliable OLAP advantage was not obtained. Implications for models are discussed.

19 citations



01 Jan 1995
TL;DR: This work investigates whether the requirements for visualisation differ when the input to knowledge discovery comprises aggregated data instead of raw, unprocessed data, and argues that a modified MDV can provide superior results for KDD.
Abstract: Knowledge discovery in databases (KDD) has as its primary objective the uncovering of new and useful knowledge from huge masses of raw data. Our research in KDD has focused on how visualisation can be exploited in the dialectic process of knowledge discovery [6]. Our approach uses a multi-dimensional data visualisation (MDV) technique that builds upon a refined and improved method of parallel coordinates. We also explore how visualisation can be used synergistically with other techniques, such as inductive learning (e.g., C4.5) and hierarchical clustering, to harness their respective strengths in KDD. Our recent work investigates whether the requirements for visualisation differ when the input to knowledge discovery comprises aggregated data instead of raw, unprocessed data. In particular, we concentrate on the multi-dimensional model, an emerging model for analysis that underlies the on-line analytical processing (OLAP) concept [2]. The OLAP camp uses the two-dimensional cross-tabulation table as its visualisation tool. We argue that this is inadequate and that a modified MDV can provide superior results for KDD.
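
The parallel-coordinates idea the abstract builds on can be sketched in a few lines. The example below uses pandas' built-in parallel_coordinates plot over a hypothetical aggregated table; the columns and values are illustrative assumptions, not the paper's MDV technique itself.

```python
# Minimal sketch of a parallel-coordinates view of aggregated data, the kind of
# multi-dimensional visualisation (MDV) contrasted with 2D cross-tabs.
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

# Hypothetical pre-aggregated OLAP output: one row per aggregated cell.
agg = pd.DataFrame({
    "group":   ["A", "A", "B", "B"],
    "sales":   [120, 95, 210, 40],
    "units":   [30, 25, 70, 10],
    "returns": [3, 2, 9, 1],
})

# Each row becomes a polyline across the measure axes; the class column
# ("group") controls the line colour.
parallel_coordinates(agg, class_column="group")
plt.show()
```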

10 citations


01 Jan 1995
TL;DR: To satisfy the OLAP requirement of efficiently getting data out of the system, different models, metaphors, and theories have been devised, all pointing to the need to simplify the highly non-intuitive mathematical constraints found in relational databases normalized to third normal form.
Abstract: The focus of information processing requirements is shifting from on-line transaction processing (OLTP) issues to on-line analytical processing (OLAP) issues. While the former serves to ensure the feasibility of real-time on-line transaction processing (which already exceeds 1,000 transactions per second under normal conditions), the latter aims at enabling more sophisticated analytical manipulation of data. The OLTP requirements, or how to efficiently get data into the system, have been addressed by applying relational theory in the form of the entity-relationship model. There is presently no theory related to OLAP that resolves the analytical processing requirements as effectively as relational theory did for transaction processing. The 'relational dogma' also provides the mathematical foundation for the centralized data processing paradigm, in which mission-critical information is kept as 'one and only one instance' of data, thus ensuring data integrity. In such an environment, the information that supports business analysis and decision support activities is obtained by running predefined reports and queries provided by the IS department. In today's intensified competitive climate, businesses are finding that this traditional approach is not good enough. The only way to stay on top of things, and to survive and prosper, is to decentralize the IS services. The newly emerging distributed data processing, with its increased emphasis on empowering the end user, does not seem to find enough merit in the relational database model to justify relying upon it; relational theory has proved too rigid and complex to accommodate analytical processing needs. To satisfy the OLAP requirements, or how to efficiently get the data out of the system, different models, metaphors, and theories have been devised, all of them pointing to the need to simplify the highly non-intuitive mathematical constraints found in relational databases normalized to third normal form. The object-oriented approach insists on the importance of the common-sense component of data processing activities. Particularly interesting, however, is the approach that advocates 'flattening' the structure of the business models as we know them today. This discipline is called Dimensional Modeling, and it enables users to form multidimensional views of the relevant facts, which are stored in a 'flat' (non-structured), easy-to-comprehend and easy-to-access database. When using dimensional modeling, we relax many of the axioms inherent in the relational model and focus on the relevant facts that reflect the business operations and form the real basis for decision support and business analysis. At the core of dimensional modeling are fact tables that contain the non-discrete, additive data. To determine the level of aggregation of these facts, we use granularity tables that specify the resolution, or level of detail, the user is allowed to entertain. The third component is dimension tables, which embody the knowledge of the constraints used to form the views.
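
As a rough illustration of the fact/dimension split described above, the sketch below builds a tiny star schema in Python with pandas and rolls it up into a multidimensional view. The table names, columns, and grain are assumptions chosen for illustration, not taken from the paper.

```python
# Minimal sketch of a dimensional model: one additive fact table plus
# dimension tables, joined and aggregated into a multidimensional view.
import pandas as pd

# Dimension tables: descriptive attributes keyed by surrogate ids.
dim_product = pd.DataFrame({"product_id": [1, 2], "category": ["Food", "Tools"]})
dim_store   = pd.DataFrame({"store_id": [10, 20], "region": ["East", "West"]})

# Fact table: foreign keys into the dimensions plus an additive measure.
# Its grain (one row per product per store per day) fixes the finest level
# of detail a user can drill down to.
fact_sales = pd.DataFrame({
    "product_id": [1, 1, 2, 2],
    "store_id":   [10, 20, 10, 20],
    "day":        ["1995-01-01"] * 4,
    "amount":     [100.0, 80.0, 250.0, 40.0],
})

# Form a multidimensional view: join facts to dimensions, then aggregate.
view = (fact_sales
        .merge(dim_product, on="product_id")
        .merge(dim_store, on="store_id")
        .pivot_table(index="category", columns="region",
                     values="amount", aggfunc="sum"))
print(view)
```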

2 citations