
Showing papers on "Apriori algorithm" published in 2012


Journal ArticleDOI
TL;DR: This paper provides an overview of the foundations of frequent item set mining, starting from a definition of the basic notions and the core task, and discusses how the search space is structured to avoid redundant search, how the output is reduced by confining it to closed or maximal item sets or generators.
Abstract: Frequent item set mining is one of the best known and most popular data mining methods. Originally developed for market basket analysis, it is used nowadays for almost any task that requires discovering regularities between (nominal) variables. This paper provides an overview of the foundations of frequent item set mining, starting from a definition of the basic notions and the core task. It continues by discussing how the search space is structured to avoid redundant search, how it is pruned with the a priori property, and how the output is reduced by confining it to closed or maximal item sets or generators. In addition, it reviews some of the most important algorithmic techniques and data structures that were developed to make the search for frequent item sets as efficient as possible. © 2012 Wiley Periodicals, Inc.

265 citations
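The a priori property mentioned in the abstract above (every subset of a frequent item set must itself be frequent) can be illustrated with a minimal level-wise sketch. The function name `apriori`, the absolute `min_support` threshold and the toy baskets are assumptions made for illustration; they are not taken from the paper, which surveys far more efficient data structures.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(item set): support count} for all frequent item sets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: n for s, n in counts.items() if n >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge frequent (k-1)-sets whose union has exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (a priori property): every (k-1)-subset must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        # Count the surviving candidates with one pass over the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result

# Hypothetical market baskets, used only to exercise the sketch.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
freq = apriori(baskets, min_support=2)
```

With these baskets, every single item and every pair reaches the threshold, while the three-item set appears in only one basket and is therefore not reported.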


Proceedings ArticleDOI
20 Feb 2012
TL;DR: DPC dynamically combines candidates of various lengths and outperforms both the straightforward algorithm SPC and the fixed-passes combined-counting algorithm FPC; all three algorithms scale linearly with dataset size and cluster size.
Abstract: Many parallelization techniques have been proposed to enhance the performance of Apriori-like frequent itemset mining algorithms. Characterized by both map and reduce functions, MapReduce has emerged and excels in the mining of datasets of terabyte scale or larger in either homogeneous or heterogeneous clusters. Minimizing the scheduling overhead of each map-reduce phase and maximizing the utilization of nodes in each phase are keys to successful MapReduce implementations. In this paper, we propose three algorithms, named SPC, FPC, and DPC, to investigate effective implementations of the Apriori algorithm in the MapReduce framework. DPC dynamically combines candidates of various lengths and outperforms both the straightforward algorithm SPC and the fixed-passes combined-counting algorithm FPC. Extensive experimental results also show that all three algorithms scale up linearly with respect to dataset sizes and cluster sizes.

225 citations
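As a rough illustration of how one Apriori counting pass maps onto the map and reduce functions described above, the following sketch simulates a single phase in plain Python. It is not the SPC, FPC or DPC algorithm from the paper: the names `map_phase` and `reduce_phase`, the in-memory "splits" and the toy data are all assumptions, and a real deployment would run on a MapReduce framework rather than in one process.

```python
from collections import defaultdict

def map_phase(split, candidates):
    """Mapper: emit (candidate, 1) for each candidate contained in a transaction."""
    for transaction in split:
        for candidate in candidates:
            if candidate <= transaction:
                yield candidate, 1

def reduce_phase(pairs, min_support):
    """Reducer: sum the counts per candidate and keep the frequent ones."""
    totals = defaultdict(int)
    for candidate, count in pairs:
        totals[candidate] += count
    return {c: n for c, n in totals.items() if n >= min_support}

# Two hypothetical data splits, as if stored on two different nodes.
splits = [
    [{"a", "b", "c"}, {"a", "b"}],
    [{"a", "c"}, {"b", "c"}, {"a", "b"}],
]
# Candidate 2-item sets for this pass (in Apriori they come from the previous pass).
candidates = [frozenset("ab"), frozenset("ac"), frozenset("bc")]

emitted = [pair for split in splits for pair in map_phase(split, candidates)]
frequent_pairs = reduce_phase(emitted, min_support=3)
```

Each pass of this kind costs one scheduling round, which is why combining candidates of several lengths into one phase, as the abstract describes for FPC and DPC, can reduce overhead.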


Proceedings ArticleDOI
08 Aug 2012
TL;DR: A parallel Apriori algorithm based on MapReduce, which is a framework for processing huge datasets on certain kinds of distributable problems using a large number of computers (nodes), is implemented.
Abstract: Searching for frequent patterns in transactional databases is considered one of the most important data mining problems, and Apriori is one of the typical algorithms for this task. Developing fast and efficient algorithms that can handle large volumes of data is challenging because of the size of the databases. In this paper, we implement a parallel Apriori algorithm based on MapReduce, which is a framework for processing huge datasets on certain kinds of distributable problems using a large number of computers (nodes). The experimental results demonstrate that the proposed algorithm can scale well and efficiently process large datasets on commodity hardware.

144 citations


Proceedings ArticleDOI
01 Jun 2012
TL;DR: This paper applies the Apriori algorithm to a database containing academic records of various students and tries to extract association rules in order to profile students based on parameters such as exam scores, term work grades, attendance and practical exams.
Abstract: Data mining is a process of identifying and extracting hidden patterns and information from databases and data warehouses. There are various algorithms and tools available for this purpose. Data mining has a vast range of applications ranging from business to medicine to engineering. In this paper, we discuss the application of data mining in education for student profiling and grouping. For student profiling we make use of the Apriori algorithm, which is one of the popular approaches for mining associations, i.e. discovering correlations among sets of items. The other algorithm, used for grouping students, is K-means clustering, which assigns a set of observations into subsets. In the field of academics, data mining can be very useful in discovering valuable information which can be used for profiling students based on their academic record. We apply the Apriori algorithm to the database containing academic records of various students and try to extract association rules in order to profile students based on various parameters like exam scores, term work grades, attendance and practical exams. We also apply K-means clustering to the same set of data in order to group the students. The implemented algorithms offer an effective way of profiling students which can be used in educational systems.

94 citations


Journal ArticleDOI
TL;DR: The results of knowledge extraction from data mining are illustrated as knowledge patterns, rules, and knowledge maps in order to propose suggestions and solutions to online group buying firms for future development.
Abstract: Highlights: Online group buying is an effective marketing method. Group buying has become extremely popular. This study proposes a data mining approach for exploring online group buying behavior in Taiwan. This study uses the Apriori algorithm and clustering analysis for data mining. Knowledge extraction yields suggestions to online group buying firms for future development. Online group buying is an effective marketing method. By using online group buying, customers get unbelievable discounts on premium products and services. This not only meets customer demand, but also helps sellers to find new ways to sell products and open up new business models; all parties benefit in these transactions. During these bleak economic times, group buying has become extremely popular. Therefore, this study proposes a data mining approach for exploring online group buying behavior in Taiwan. This study uses the Apriori algorithm as an association rules approach, and clustering analysis for data mining, which is implemented for mining customer knowledge among online group buying customers in Taiwan. The results of knowledge extraction from data mining are illustrated as knowledge patterns, rules, and knowledge maps in order to propose suggestions and solutions to online group buying firms for future development.

90 citations


Journal ArticleDOI
Liang Wang, D. W-L Cheung, Reynold Cheng, Sau Dan Lee, Xuan Yang
TL;DR: This paper proposes incremental mining algorithms, which enable Probabilistic Frequent Item set (PFI) results to be refreshed, and develops an approximate algorithm, which can efficiently and accurately discover frequent item sets in a large uncertain database.
Abstract: The data handled in emerging applications like location-based services, sensor monitoring systems, and data integration, are often inexact in nature. In this paper, we study the important problem of extracting frequent item sets from a large uncertain database, interpreted under the Possible World Semantics (PWS). This issue is technically challenging, since an uncertain database contains an exponential number of possible worlds. By observing that the mining process can be modeled as a Poisson binomial distribution, we develop an approximate algorithm, which can efficiently and accurately discover frequent item sets in a large uncertain database. We also study the important issue of maintaining the mining result for a database that is evolving (e.g., by inserting a tuple). Specifically, we propose incremental mining algorithms, which enable Probabilistic Frequent Item set (PFI) results to be refreshed. This reduces the need of re-executing the whole mining algorithm on the new database, which is often more expensive and unnecessary. We examine how an existing algorithm that extracts exact item sets, as well as our approximate algorithm, can support incremental mining. All our approaches support both tuple and attribute uncertainty, which are two common uncertain database models. We also perform extensive evaluation on real and synthetic data sets to validate our approaches.

86 citations
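The Poisson binomial view mentioned in the abstract above can be made concrete with a small sketch. Under the assumption that each transaction contains a given item set independently with its own probability, the support count follows a Poisson binomial distribution, and the probability of reaching a support threshold can be computed exactly by dynamic programming. This is not the paper's approximate algorithm; the function name `frequentness_probability` and the example probabilities are hypothetical.

```python
def frequentness_probability(probs, min_support):
    """P(support >= min_support) when transaction i contains the item set
    independently with probability probs[i] (a Poisson binomial tail)."""
    # dist[j] = probability that exactly j of the transactions seen so far contain it.
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            nxt[j] += mass * (1.0 - p)      # this transaction does not contain the set
            nxt[j + 1] += mass * p          # this transaction contains the set
        dist = nxt
    return sum(dist[min_support:])

# Hypothetical per-transaction containment probabilities.
probs = [0.5, 0.5]
expected_support = sum(probs)               # mean of the Poisson binomial
p_freq = frequentness_probability(probs, min_support=1)
```

The exact computation is quadratic in the number of transactions, which suggests why an approximation based on the distribution's parameters is attractive for large uncertain databases.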


Journal ArticleDOI
TL;DR: The proposed framework integrating network intrusion detection system (NIDS) in the Cloud consists of Snort and signature apriori algorithm, which aims to detect known attacks and derivative of known attacks in Cloud by monitoring network traffic, while ensuring low false positive rate with reasonable computational cost.

72 citations


01 Jan 2012
TL;DR: A FIS data mining association algorithm is presented that removes the disadvantages of the APRIORI algorithm: it is efficient in terms of time and the number of database scans, and it eliminates costly candidate generation.
Abstract: In this paper we present a new scheme for extracting association rules that considers the time, number of database scans, memory consumption, and the interestingness of the rules. We describe a FIS data mining association algorithm that removes the disadvantages of the APRIORI algorithm and is efficient in terms of the number of database scans and time. The frequent patterns algorithm without candidate generation eliminates the costly candidate generation step. It also avoids scanning the database again and again. We therefore use the Frequent Pattern (FP) Growth ARM algorithm, which provides a more efficient structure for mining patterns as the database grows.

71 citations


Proceedings ArticleDOI
01 Apr 2012
TL;DR: It is proved that even a sub-problem of this problem, computing the frequent closed probability of an item set, is #P-Hard, and an efficient mining algorithm is developed based on depth-first search strategy to obtain all probabilistic frequent closed item sets.
Abstract: In recent years, many new applications, such as sensor network monitoring and moving object search, show the growing importance of uncertain data management and mining. In this paper, we study the problem of discovering threshold-based frequent closed item sets over probabilistic data. Frequent item set mining over probabilistic databases has attracted much attention recently. However, existing solutions may lead to an exponential number of results due to the downward closure property over probabilistic data. Moreover, it is hard to directly extend the successful experiences from mining exact data to a probabilistic environment due to the inherent uncertainty of data. Thus, in order to obtain a reasonable result set with small size, we study discovering frequent closed item sets over probabilistic data. We prove that even a sub-problem of this problem, computing the frequent closed probability of an item set, is #P-Hard. Therefore, we develop an efficient mining algorithm based on depth-first search strategy to obtain all probabilistic frequent closed item sets. To reduce the search space and avoid redundant computation, we further design several probabilistic pruning and bounding techniques. Finally, we verify the effectiveness and efficiency of the proposed methods through extensive experiments.

65 citations


01 Jan 2012
TL;DR: An efficient MapReduce Apriori algorithm (MRApriori) based on the Hadoop MapReduce model, which needs only two phases (MapReduce jobs) to find all frequent k-itemsets, is implemented and compared with two existing algorithms which need either one or k phases to find the same frequent itemsets.
Abstract: Finding frequent itemsets is one of the most important fields of data mining. The Apriori algorithm is the most established algorithm for finding frequent itemsets from a transactional dataset; however, it needs to scan the dataset many times and to generate many candidate itemsets. Unfortunately, when the dataset size is huge, both memory use and computational cost can still be very expensive. In addition, a single processor's memory and CPU resources are very limited, which makes the algorithm inefficient. Parallel and distributed computing are effective strategies for accelerating algorithm performance. In this paper, we have implemented an efficient MapReduce Apriori algorithm (MRApriori) based on the Hadoop MapReduce model which needs only two phases (MapReduce jobs) to find all frequent k-itemsets, and compared our proposed MRApriori algorithm with two existing algorithms which need either one or k phases (k is the maximum length of frequent itemsets) to find the same frequent k-itemsets. Experimental results showed that the proposed MRApriori algorithm outperforms the other two algorithms.

63 citations


Journal ArticleDOI
TL;DR: A rule base generated by the Apriori algorithm is applied in a Web-based Intrusion Detection System to identify a variety of attacks, which improves the overall performance of the detection system.

Book ChapterDOI
29 May 2012
TL;DR: This paper describes Sharemind--a toolkit which allows data mining specialists with no cryptographic expertise to develop data mining algorithms with good security guarantees, lists the building blocks needed to deploy a privacy-preserving data mining application, and explains the design decisions that make Sharemind applications efficient in practice.
Abstract: The issue of potential data misuse arises whenever data is collected from several sources. In a common setting, a large database is either horizontally or vertically partitioned between multiple entities who want to find global trends from the data. Such tasks can be solved with secure multi-party computation (MPC) techniques. However, practitioners tend to consider such solutions inefficient. Furthermore, there are no established tools for applying secure multi-party computation in real-world applications. In this paper, we describe Sharemind--a toolkit which allows data mining specialists with no cryptographic expertise to develop data mining algorithms with good security guarantees. We list the building blocks needed to deploy a privacy-preserving data mining application and explain the design decisions that make Sharemind applications efficient in practice. To validate the practical feasibility of our approach, we implemented and benchmarked four algorithms for frequent itemset mining.

Journal ArticleDOI
TL;DR: This paper proposes an algorithm that mines negative association rules using the conviction measure, which does not require extra database scans; negative association rules are also very convenient for associative classifiers, which build their classification model based on association rules.
Abstract: Association rule mining is a data mining task that discovers associations among items in a transactional database. Typical association rules consider only items enumerated in transactions. Such rules are referred to as positive association rules. Negative association rules also consider the same items, but in addition consider negated items (i.e. absent from transactions). Negative association rules are useful in market-basket analysis to identify products that conflict with each other or products that complement each other. They are also very convenient for associative classifiers, classifiers that build their classification model based on association rules. Many other applications would benefit from negative association rules if it was not for the expensive process to discover them. Indeed, mining for such rules necessitates the examination of an exponentially large search space. In this paper, we propose an algorithm that mines negative association rules by using the conviction measure, which does not require extra database scans.
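For readers unfamiliar with the conviction measure named above, the sketch below computes it from its standard definition, conviction(X -> Y) = (1 - supp(Y)) / (1 - conf(X -> Y)), where conf(X -> Y) = supp(X and Y) / supp(X). A value near 1 suggests independence and a value below 1 suggests that X makes Y less likely, which is the kind of signal a negative rule would capture. How the paper's algorithm actually uses conviction is not stated in the abstract; the helper names and the toy transactions are assumptions.

```python
import math

def support(transactions, itemset):
    """Fraction of transactions that contain every item of the item set."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)

def conviction(transactions, antecedent, consequent):
    """Conviction of the rule antecedent -> consequent (standard definition)."""
    supp_x = support(transactions, antecedent)
    if supp_x == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    supp_y = support(transactions, consequent)
    supp_xy = support(transactions, set(antecedent) | set(consequent))
    confidence = supp_xy / supp_x
    if confidence == 1.0:
        return math.inf                      # the rule is never violated
    return (1.0 - supp_y) / (1.0 - confidence)

# Hypothetical transactions, used only to exercise the sketch.
transactions = [{"tea", "milk"}, {"tea"}, {"milk"}, {"coffee"}]
conv_tea_milk = conviction(transactions, {"tea"}, {"milk"})
conv_coffee_milk = conviction(transactions, {"coffee"}, {"milk"})
```

All quantities here are derived from support values, so a miner that already holds the supports of X, Y and their union needs no further pass over the database to evaluate the measure.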

Journal ArticleDOI
TL;DR: This study represents a recommendation engine which was developed to personalize an e-commerce website and shows that the recommendation engine increases the basket ratio.

Proceedings ArticleDOI
03 Aug 2012
TL;DR: An algorithm combining Simple K-means clustering and the Apriori association rule algorithm is developed in Java; its results contain more association rules than those obtained using the open source data mining tool Weka.
Abstract: In this paper we consider the applicability of data mining algorithms such as clustering and association rule algorithms for recommending courses to students in an E-Learning System, e.g. a student who liked to study the course "Operating System" is quite likely to study the course "Distributed System". We develop an algorithm in Java which is the combination of Simple K-means clustering and the Apriori association rule algorithm. The results obtained using this combination are compared with the results obtained using the open source data mining tool Weka, and the comparison is presented. Results using the developed algorithm contain more association rules than the results obtained using Weka.

Journal ArticleDOI
01 May 2012
TL;DR: The proposed algorithm integrates imprecise data concepts and the fuzzy apriori mining algorithm to find interesting fuzzy association rules in given databases to address the problem of imprecision in data mining.
Abstract: Data mining is most commonly used in attempts to induce association rules from databases which can help decision-makers easily analyze the data and make good decisions regarding the domains concerned. Different studies have proposed methods for mining association rules from databases with crisp values. However, the data in many real-world applications have a certain degree of imprecision. In this paper we address this problem, and propose a new data-mining algorithm for extracting interesting knowledge from databases with imprecise data. The proposed algorithm integrates imprecise data concepts and the fuzzy apriori mining algorithm to find interesting fuzzy association rules in given databases. Experiments for diagnosing dyslexia in early childhood were made to verify the performance of the proposed algorithm.

Journal ArticleDOI
TL;DR: This paper uses the ADTree classification algorithm, the Simple K-means algorithm and the Apriori association rule algorithm to find the best combination of algorithms for recommending courses to students in E-learning.
Abstract: Data mining is the extraction of hidden predictive information from large databases, which can be used in various applications such as bioinformatics and e-commerce. Association rules, classification and clustering are three different kinds of algorithms in data mining. A Course Recommender System plays an important role in identifying the behavior of students interested in a particular set of courses. We collect data regarding course enrollment for a specific set of data. For collecting this data, we use a learning management system such as Moodle. After collecting the data, we apply different combinations of data mining algorithms: classification and association rule algorithms, clustering and association rule algorithms, association rule mining on classified and clustered data, combined clustering and classification with association rule algorithms, or simply the association rule algorithm. In this paper we use the ADTree classification algorithm, the Simple K-means algorithm and the Apriori association rule algorithm as the machine learning algorithms. We thus propose five different methods to find the best combination of algorithms for recommending courses to students in E-learning. We compare the results of the combined approaches with those of the association rule algorithm alone and present the best combination of algorithms for recommending courses in E-learning according to our simulation.

Proceedings ArticleDOI
16 Apr 2012
TL;DR: This paper has improved the Apriori algorithm for mining association rules and made it applicable to the OTSNs and shows that the scheme outperforms the existing schemes in terms of energy efficiency and accuracy of tracking.
Abstract: Recently, the object tracking application of sensor networks has drawn significant attention from researchers due to its wide applicability. However, most of these studies cannot deal with the trade-off between energy efficiency and accuracy of the tracking. In object tracking sensor networks (OTSNs), the movement of the object generally follows some definite patterns. The moving object's location, arrival time and path are likely to hide some useful association rules, which can be excavated by applying a suitable data mining algorithm. In this paper, we have proposed an object tracking scheme for OTSNs using a data mining approach. We have improved the Apriori algorithm for mining association rules and made it applicable to the OTSNs. The data mining algorithm is applied to the past movement information of the object and useful association rules are excavated, which are then used to predict the next location of the object. Our scheme predicts the next location of the object more accurately and increases the network lifetime. Experiments have been conducted to evaluate the performance of our proposed scheme for OTSNs and they show that our scheme outperforms the existing schemes in terms of energy efficiency and accuracy of tracking.

Journal ArticleDOI
TL;DR: A procedure to discover customers’ markets and rules, which adopts the recency, frequency, monetary value (RFM) variables, transaction records, and socioeconomic data of the online shoppers to be the research variables is proposed.
Abstract: Purpose – The purpose of this paper is to establish customers’ markets and rules of dynamic customer relationship management (CRM) systems for online retailers. Design/methodology/approach – This research proposes a procedure to discover customers’ markets and rules, which adopts the recency, frequency, monetary value (RFM) variables, transaction records, and socioeconomic data of the online shoppers to be the research variables. The research methods aim at the supervised apriori algorithm, C5.0 decision tree algorithm, and RFM model. Findings – This research discovered eight RFM markets and six rules of online retailers. Practical implications – The proposed framework and research results can help retailer managers to retain and expand high value markets via their dynamic CRM and POS systems. Originality/value – This research uses data mining technologies to extract high value markets and rules for marketing plans. The research variables are easy to obtain via retailers’ systems. The found customer values, R...

Journal ArticleDOI
TL;DR: A Fuzzy Expert System is designed based on the selected rules from association rules to specify the Credit Degree of banks' customers by classifying the bank's customers via association rules with the use of the APRIORI algorithm and CRISP-DM methodology.
Abstract: Credit assessment is a very typical classification problem in Data Mining. A type of classification technique that has attracted an increasing number of attempts in recent years is finding classification rules based on association rule mining techniques. This paper aims to contribute to this kind of research by classifying the bank's customers via association rules with the use of the APRIORI algorithm and CRISP-DM methodology, and by considering the Experts' opinions to filter the obtained rules and define the Membership functions for the considered criteria. Finally, a Fuzzy Expert System is designed based on the selected rules from association rules to specify the Credit Degree of banks' customers. The presented steps have been studied in an Iranian Bank as an empirical study.

Proceedings ArticleDOI
21 Mar 2012
TL;DR: A fuzzy weighted association rule mining with GNP framework is proposed that is suitable for both continuous and discrete attributes; it follows an Apriori algorithm based fuzzy WAR and GNP and avoids pre- and post-processing, thus eliminating the extra steps during rule generation.
Abstract: Conventional network security relies simply on mathematical algorithms and low-level countermeasures to prevent intrusions, although most of these approaches are theoretically challenging to implement. Therefore, a variety of algorithms have been committed to this challenge. Instead of generating a large number of rules, evolutionary optimization techniques like Genetic Network Programming (GNP) can be used. GNP is based on a directed graph. This paper focuses on the security issues related to deploying a data mining-based IDS in a real-time environment. We generalize the problem of GNP with association rule mining and propose a fuzzy weighted association rule mining with GNP framework suitable for both continuous and discrete attributes. Our proposal follows an Apriori algorithm based fuzzy WAR and GNP and avoids pre- and post-processing, thus eliminating the extra steps during rule generation. This method is sufficient to evaluate misuse and anomaly detection. Experiments on KDD99Cup and DARPA98 data show a high detection rate and accuracy compared with other conventional methods.

Journal ArticleDOI
TL;DR: Apriori algorithm was used to identify the rules (conditions) which had caused each record to be placed in each specific cluster and thereby to find a way to assess the efficiency of the maintenance system and activities.
Abstract: Maintenance has always been considered an important part of both manufacturing and service systems and yet a costly practice. The purpose of this study is to analyze the efficiency of the maintenance activities in a maintenance system comprising independent components, using the data collected in process. For this purpose, a three-stage method was followed. First, at the initial data preprocessing stage, after the data purification, new operating fields were defined. The data was integrated in a final matrix which was used as an input for the modeling phase. At this stage, using one of the clustering algorithms, i.e. k-means, the maintenance data was clustered so that homogeneous clusters of the components, i.e. buses, were formed. Then, using the Euclidean distance, the distances of the clusters from the ideal status were found and clusters were categorized and named accordingly. In the last part of the modeling stage, with the clusters as target, the Apriori algorithm was used to identify the rules (conditions) which had caused each record to be placed in each specific cluster and thereby to find a way to assess the efficiency of the maintenance system and activities. At the third stage, on the basis of the extracted rules, necessary steps were proposed to eliminate the conditions which lead records to be placed in the clusters comprising records of bad conditions. The method is explained in a case study of the maintenance system of an urban transportation bus network.

Book ChapterDOI
27 Sep 2012
TL;DR: The KDD process is presented which includes the application of the Apriori algorithm for the association rules mining from the educational data of ESOG Web-based application.
Abstract: Many researchers have focused on the mining of educational data stored in databases of educational software and Learning Management Systems. The goal is the knowledge discovery that can help educators to support their students by managing effectively educational units, redesigning student’s activities and finally improving the learning outcome. A basic data mining technique concerns the discovery of hidden associations that exist in data stored in educational software Databases. In this paper, we present the KDD process which includes the application of the Apriori algorithm for the association rules mining from the educational data of ESOG Web-based application.

Proceedings ArticleDOI
13 Dec 2012
TL;DR: The equivalence redundancy of fuzzy items and related theorems as a new concept for fuzzy data mining is defined and a basic algorithm based on the Apriori algorithm for rule extraction is proposed utilizing the equivalences redundancy of the fuzzy items based on redundancy concepts of fuzzy association rules.
Abstract: In data mining approaches, quantitative attributes should be appropriately dealt with as well as Boolean attributes. This paper presents an essential improvement for extracting fuzzy association rules from a database. The objective of this paper is to improve the computational time of mining and to prune extracted redundant rules simultaneously for an actual data mining application. In this paper, we define the equivalence redundancy of fuzzy items and related theorems as a new concept for fuzzy data mining. Then, we propose a basic algorithm based on the Apriori algorithm for rule extraction utilizing the equivalence redundancy of the fuzzy items based on redundancy concepts of fuzzy association rules. The essential performance of the algorithm is evaluated through numerical experiments using benchmark data. From the results, the method is found to be promising in terms of computational time and redundant-rule pruning.

Proceedings ArticleDOI
Yubo Jia, Guanghu Xia, Hongdan Fan, Qian Zhang, Xu Li
21 Oct 2012
TL;DR: An improved algorithm based on a combination of Data Division and Dynamic Item sets Counting is proposed that can effectively improve the performance of Data Mining.
Abstract: Association Rules Mining is an important branch of Data Mining Technology, of which Apriori Algorithm is the most influential and classic one. After discussing and analyzing the basic concept of Association Rules Mining, this paper proposes an improved algorithm based on a combination of Data Division and Dynamic Item sets Counting. Analysis of the improved algorithm proves that it can effectively improve the performance of Data Mining.

Journal ArticleDOI
TL;DR: This paper focuses on map/reduce design and implementation of Apriori algorithm for structured data analysis, which stands as an elementary foundation to supervised learning, which encompasses classifier and feature extraction methods.
Abstract: Apriori is one of the key algorithms to generate frequent itemsets. Analyzing frequent itemset is a crucial step in analysing structured data and in finding association relationship between items. This stands as an elementary foundation to supervised learning, which encompasses classifier and feature extraction methods. Applying this algorithm is crucial to understand the behaviour of structured data. Most of the structured data in scientific domain are voluminous. Processing such kind of data requires state of the art computing machines. Setting up such an infrastructure is expensive. Hence a distributed environment such as a clustered setup is employed for tackling such scenarios. Apache Hadoop distribution is one of the cluster frameworks in distributed environment that helps by distributing voluminous data across a number of nodes in the framework. This paper focuses on map/reduce design and implementation of Apriori algorithm for structured data analysis.

Proceedings ArticleDOI
11 May 2012
TL;DR: The concept and the effect of association rules are introduced, the classic association rule algorithms are analyzed, and methods for improving the efficiency of the Apriori algorithm are presented which reduce the time spent scanning the database and shorten the computation time of the algorithm.
Abstract: Association rule mining finds interesting association or correlation relationships among a large set of data items, which is an important task of data mining. Meanwhile, Apriori is an influential algorithm for mining frequent itemsets for Boolean association rules. Firstly, the concept and the effect of association rules are introduced and the classic algorithms of association rule are analyzed. In Apriori algorithm, most time is consumed for scanning the database repeatedly. Therefore, the methods are presented about improving the Apriori algorithm efficiency, which reduces a lot of time of scanning database and shortens the computation time of the algorithm. Furthermore, several typical applications of association rules in Market-Basket Analysis are given.

Journal Article
TL;DR: The aim of this paper is to develop a useful trend for launching products with configurations for customers of different gender, based on the customers' past transactions.
Abstract: The analysis of customer behavior is used to maintain good relationship with customers. It maximizes the customer satisfaction. We can also improve customer loyalty and retention. The aim of this paper is to develop a very useful trend for launching products with configurations for customers of different gender based on past transactions. Based on the previous transactions of the customers, prediction is done and data is estimated with the help of clustering and association rules. This paper proposes an effective method to extract knowledge from transaction records which is very useful for increasing sales. Customer details are segmented using k-means and then the Apriori algorithm is applied to identify customer behavior. This is followed by the identification of product associations within segments. This paper aims to develop a new trend and launch a new series of products using the previous transactions of the customers. General Terms: Clustering analysis, Association rules

Proceedings ArticleDOI
01 Dec 2012
TL;DR: An improved apriori algorithm based on a compressed transaction database, which is compressed based on the consequence of interest, is proposed.
Abstract: Association rule mining is used to uncover closely related item sets in transactions for deciding business policies. The Apriori algorithm is widely adopted in association rule mining for generating closely related item sets. The traditional apriori algorithm is space and time consuming since it requires repeated scanning of the whole transaction database. In this paper we propose an improved apriori algorithm based on a compressed transaction database. The transaction database is compressed based on the consequence of interest.

Journal ArticleDOI
TL;DR: An improved algorithm based on the Apriori algorithm, using a vertical data layout, breadth-first searching, and intersecting, is given to analyze car crash tests in C-NCAP, and it can provide a reference for automotive design.
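The vertical data layout and intersecting mentioned in this entry can be sketched as follows: each item is mapped to the set of transaction ids that contain it, and the support of an item set is the size of the intersection of those id sets. This is a generic illustration of the technique, not the paper's algorithm; the function names and the toy transactions are assumptions and have nothing to do with C-NCAP data.

```python
from collections import defaultdict

def to_vertical(transactions):
    """Convert a list of transactions into {item: set of transaction ids}."""
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)

def support_by_intersection(tidsets, itemset):
    """Support count of an item set, obtained by intersecting its items' id sets."""
    items = list(itemset)
    common = set(tidsets.get(items[0], set()))
    for item in items[1:]:
        common &= tidsets.get(item, set())
    return len(common)

# Hypothetical transactions, used only to exercise the sketch.
transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
vertical = to_vertical(transactions)
support_ab = support_by_intersection(vertical, {"a", "b"})
support_abc = support_by_intersection(vertical, {"a", "b", "c"})
```

Once the vertical layout is built, candidate supports are computed from the id sets alone, so the original database does not have to be rescanned at each level.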