Topic

Apriori algorithm

About: Apriori algorithm is a research topic. Over its lifetime, 4105 publications have been published within this topic, receiving 85965 citations.


Papers
Book Chapter
05 Mar 2020
TL;DR: The improved Apriori algorithm can be applied to the analysis of prescription compatibility, finding high-frequency herbs and herb combinations with low storage consumption and high efficiency; the results can serve as a reference for the clinical use of herbs and reveal the compatibility rules of the classical prescriptions.
Abstract: Objective: To analyze the traditional Chinese medicine (TCM) prescription regularity of the Treatise on Febrile Diseases by using an improved Apriori algorithm for more efficient data mining. Methods: 113 formulae from the Treatise on Febrile Diseases were collected, and the herb names in these prescriptions were standardized. This paper proposes valid-value index storage and a fast intersection operation to improve the efficiency of mining TCM data. The support-confidence-lift framework is adopted to evaluate the effectiveness of the rules and to avoid generating meaningless rules. Results: 18 high-frequency herbs occurring 10 times or more were identified, including Licorice, Cassia Twig, Jujube and Ginseng. Among the 18 high-frequency herbs, 52 combinations of classical traditional herb pairs were obtained, such as Licorice-Cassia Twig, Jujube-Ginger and Ginger-Licorice. Conclusion: The improved Apriori algorithm can be applied to the analysis of prescription compatibility and finds high-frequency herbs and herb combinations with low storage consumption and high efficiency. The experimental results can serve as a reference for the clinical use of herbs and reveal the compatibility rules of the classical prescriptions.
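The support-confidence-lift framework mentioned in this abstract can be illustrated with a minimal sketch. The function names and the toy formulae below are hypothetical; they are not the 113 formulae from the study, and the paper's index storage and fast intersection are not reproduced here.

```python
from typing import Dict, FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def rule_metrics(antecedent, consequent, transactions) -> Dict[str, float]:
    """Support, confidence and lift of the rule antecedent => consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    s_both = support(a | c, transactions)
    s_a = support(a, transactions)
    s_c = support(c, transactions)
    confidence = s_both / s_a if s_a else 0.0
    # Lift above 1 suggests the two sides co-occur more often than by chance.
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_both, "confidence": confidence, "lift": lift}


# Hypothetical toy data: each formula is a set of herb names.
formulae = [
    frozenset({"licorice", "cassia twig", "ginger", "jujube"}),
    frozenset({"licorice", "cassia twig"}),
    frozenset({"licorice", "ginseng"}),
    frozenset({"ginger", "jujube"}),
]
metrics = rule_metrics({"licorice"}, {"cassia twig"}, formulae)
print(metrics)
```

A rule with high confidence but lift close to 1 is the kind of "meaningless" rule that a lift filter is meant to discard.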
Journal Article
01 Jul 2017
TL;DR: The results show that the data from day 1 and day 16 are the most closely related, which can guide staff to analyze the motor's operation on these two days in order to find and resolve faults and safety problems.
Abstract: A shore-hoisting motor produces a large amount of vibration signal data in its daily work. To analyze the correlation among the data and discover faults and potential safety hazards of the motor, the data are first discretized, and then the Apriori algorithm is used to mine the strong association rules among the data. The results show that the data from day 1 and day 16 are the most closely related, which can guide staff to analyze the motor's operation on these two days in order to find and resolve faults and safety problems.
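The abstract does not say how the vibration signals are discretized. As one possible illustration, the sketch below assumes equal-width binning into three levels and turns each time slot into a transaction of "day=level" items that an Apriori implementation could consume. The readings, labels and function name are all hypothetical.

```python
from typing import Dict, List, Sequence, Set


def equal_width_bins(values: Sequence[float],
                     labels: Sequence[str] = ("low", "mid", "high")) -> List[str]:
    """Map each numeric value to a label using equal-width intervals."""
    lo, hi = min(values), max(values)
    # Fall back to width 1.0 when all values are identical.
    width = (hi - lo) / len(labels) or 1.0
    result = []
    for v in values:
        idx = min(int((v - lo) / width), len(labels) - 1)
        result.append(labels[idx])
    return result


# Hypothetical vibration amplitudes, one list per day, aligned by time slot.
readings: Dict[str, List[float]] = {
    "day1": [1.0, 5.0, 9.0, 2.0],
    "day16": [2.0, 6.0, 10.0, 2.0],
}
levels = {day: equal_width_bins(vals) for day, vals in readings.items()}

# One transaction per time slot, containing one "day=level" item per day.
n_slots = len(next(iter(readings.values())))
transactions: List[Set[str]] = [
    {f"{day}={levels[day][i]}" for day in readings} for i in range(n_slots)
]
print(transactions)
```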
Proceedings Article
11 Jul 2007
TL;DR: The newly proposed form of Apriori can be effectively applied to more situations in which its primitive form does not work and is successfully applied to the problem of clustering security events organized in a hierarchical manner.
Abstract: Due to its excellent performance, Apriori is frequently adopted to discover frequent itemsets, from which strong association rules can be easily generated, from among massive amounts of transactional or relational data. In this paper, Apriori is reconsidered with a more abstract perspective of data space, and a more general form of the algorithm is proposed. As is shown in this paper, the newly proposed form of the algorithm can be effectively applied to more situations in which its primitive form does not work. The more general form of Apriori is successfully applied to the problem of clustering security events organized in a hierarchical manner, which illustrates its usefulness.
Book Chapter
18 Aug 2017
TL;DR: A new improved version of Apriori is proposed in this paper that is more efficient than the Apriori algorithm in terms of both running time and number of database scans.
Abstract: Obtaining frequent itemsets from a dataset is one of the most promising areas of data mining. The Apriori algorithm is one of the most important algorithms for obtaining frequent itemsets from a dataset, but it performs poorly in terms of running time and number of database scans. Hence a new improved version of Apriori is proposed in this paper that is more efficient than the Apriori algorithm in terms of both running time and number of database scans. It is well known that the size of the database used for defining candidates has a great effect on running time and memory requirements. The usefulness of the adaptive Apriori algorithm with respect to the dimensionality of the dataset is demonstrated. We present experimental results showing that the proposed algorithm always outperforms Apriori. To evaluate the performance of the proposed algorithm, we tested it on a database of faculty evaluations by students in Turkey.
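To make the "number of database scans" cost concrete, here is a minimal sketch of the classical level-wise Apriori baseline (not the improved version proposed in the paper) with a scan counter. The function name, the toy baskets and the absolute-count threshold are assumptions for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Tuple


def apriori_with_scans(transactions: Iterable[Iterable[str]],
                       min_count: int) -> Tuple[Dict[FrozenSet[str], int], int]:
    """Return (frequent itemsets with counts, number of passes over the data)."""
    db = [frozenset(t) for t in transactions]
    scans = 0

    # Pass 1: count single items.
    counts: Dict[FrozenSet[str], int] = {}
    for t in db:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    scans += 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        if not candidates:
            break
        # One full pass over the database per level.
        counts = {c: 0 for c in candidates}
        for t in db:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        scans += 1
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result, scans


baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
itemsets, scans = apriori_with_scans(baskets, min_count=3)
print(len(itemsets), "frequent itemsets found in", scans, "scans")
```

The baseline needs one pass per itemset size, which is the cost that scan-reducing variants aim to cut.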
Journal Article
TL;DR: A new approach to frequent itemset mining that is more efficient in time and space is introduced, reducing the complexity (time and memory) of frequent pattern mining.
Abstract: Frequent/periodic itemset mining is an extensively used data mining method for market basket analysis and privacy preservation, and it is a favourite theme among researchers. Substantial work has been devoted to this research, and tremendous progress has been made in this field so far. Frequent/periodic itemset mining is used to search for and recover relationships in a given data set. This paper introduces a new approach to frequent itemset mining that is more efficient in time and space. Our method scans the database only once, whereas previous algorithms scan the database many times and therefore use more time and memory than the new one. In this way, the new algorithm reduces the complexity (time and memory) of frequent pattern mining. We present efficient techniques to implement the new approach.
Keywords: Incremental Association Rule Mining, Minimum Support Threshold (MST), Transactional Data Set.
I. Introduction
Data mining is the process of discovering and analyzing useful data from a large data set. The goal of the data mining process is to extract useful information from a data set and transform it into an understandable structure for further use. It allows the user to analyze the data from various dimensions, categorize it and summarize the relationships identified. Data mining has emerged in various areas such as customer relationship management (identifying those who are likely to leave for a competitor), banking (loan and credit card approval, predicting good customers based on old customers), targeted marketing (identifying likely responders to promotions) and fraud detection (telecommunications, financial transactions). Data mining is the key part of the Knowledge Discovery in Databases (KDD) (1) (4) process. Data selection, data cleaning, data transformation, data mining, finding presentation, finding interpretation and finding evaluation are the steps involved in the KDD process. There are different kinds of methods and techniques for data mining.
Tasks in data mining can be classified as Summarization (relevant data is summarized and abstracted, resulting in a smaller set that gives an overview of the data, usually with complete information), Classification (determining the class of an object based on its attributes), Clustering (identification of classes), Trend analysis, Regression and Deviation (predictive mining), and Association Rule Discovery (1) (2) and Sequential Pattern Discovery (descriptive mining). Data mining has adopted its techniques from various research areas, including statistical approaches (Bayesian networks), machine learning, database systems, neural networks, rough sets and visualization. Predictive mining is the technique used to predict unknown variables or future values of other variables, and descriptive mining is the technique used to find human-interpretable patterns that describe the data. Association rules are one of the major techniques in data mining. The most important task in association rule mining is to find the frequent/periodic patterns, associations, correlations or causal structures among sets of items or objects in transaction or relational databases and other information repositories (13). Given a set of transactions, each consisting of items, an association rule between itemsets P and R is denoted P => R, where P and R are disjoint (their intersection is empty). Association rules can be useful for commodity management, marketing, etc. The support of this rule is defined as the percentage of transactions that contain both P and R, and the confidence of this rule is defined as the percentage of the transactions containing P that also contain R. In association rule mining, a frequent itemset is an itemset whose support is greater than the Minimum Support Threshold (MST). The minimum support threshold is a user-defined support value used to generate frequent items. Earlier algorithms for discovering frequent patterns are static in nature.
These algorithms cannot work efficiently whenever the original database changes, and in the real world data grows continuously. One solution to this problem is to rerun the algorithm on the updated database, but CPU utilization and running time are then very high, and this approach is costly when only a small amount of data is inserted. The efficiency of these algorithms depends on the number of passes and scans required for processing. A new algorithm was introduced to discover frequent items whenever new data is added dynamically to the original database. This algorithm was based on the Generate and Test method, in which all possible candidates are generated and then tested against the minimum support threshold (MST).
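The Generate and Test idea described above can be sketched as a brute-force enumeration: generate every possible itemset, then keep those that meet the MST. This is only an illustration, not the paper's one-scan algorithm. The function name and toy database are hypothetical, and treating the threshold as inclusive (>=) is an assumption, since the text says "greater than". Here the support of an itemset is the fraction of transactions containing all of its items.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable


def generate_and_test(transactions: Iterable[Iterable[str]],
                      mst: float) -> Dict[FrozenSet[str], float]:
    """Enumerate every candidate itemset and keep those with support >= mst."""
    db = [frozenset(t) for t in transactions]
    items = sorted(set().union(*db))
    frequent: Dict[FrozenSet[str], float] = {}
    for size in range(1, len(items) + 1):
        # Generate: all itemsets of this size over the item universe.
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Test: fraction of transactions containing the whole candidate.
            sup = sum(1 for t in db if cset <= t) / len(db)
            if sup >= mst:  # inclusive threshold is an assumption
                frequent[cset] = sup
    return frequent


# Hypothetical toy database using the item names P, R and S.
db = [{"P", "R"}, {"P", "R", "S"}, {"P"}, {"S"}]
frequent = generate_and_test(db, mst=0.5)

# Confidence of P => R: share of transactions with P that also contain R.
confidence_p_r = frequent[frozenset({"P", "R"})] / frequent[frozenset({"P"})]
print(sorted(map(sorted, frequent)), confidence_p_r)
```

Because the number of candidates grows exponentially with the number of items, this form is practical only for tiny item universes, which is why pruning and incremental updates matter.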

Network Information
Related Topics (5)
Fuzzy logic: 151.2K papers, 2.3M citations, 84% related
Cluster analysis: 146.5K papers, 2.9M citations, 83% related
Support vector machine: 73.6K papers, 1.7M citations, 83% related
Software: 130.5K papers, 2M citations, 80% related
Feature extraction: 111.8K papers, 2.1M citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    92
2022    291
2021    180
2020    216
2019    209
2018    223