Dynamic itemset counting and implication rules for market basket data

doi:10.1145/253260.253325

Proceedings Article

Vol. 26, Iss. 2, pp. 255-264

TLDR

The paper presents a new algorithm for finding large itemsets that uses fewer passes over the data than classic algorithms, yet fewer candidate itemsets than sampling-based methods, together with a new way of generating “implication rules” that are normalized based on both the antecedent and the consequent.

Abstract:

We consider the problem of analyzing market-basket data and present several important contributions. First, we present a new algorithm for finding large itemsets which uses fewer passes over the data than classic algorithms, and yet uses fewer candidate itemsets than methods based on sampling. We investigate the idea of item reordering, which can improve the low-level efficiency of the algorithm. Second, we present a new way of generating “implication rules,” which are normalized based on both the antecedent and the consequent and are truly implications (not simply a measure of co-occurrence), and we show how they produce more intuitive results than other methods. Finally, we show how different characteristics of real data, as opposed to synthetic data, can dramatically affect the performance of the system and the form of the results.
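The abstract does not spell out how an implication rule is scored. The sketch below assumes the measure usually called conviction, P(A)P(not B) / P(A and not B), which matches the description of a rule normalized by both antecedent and consequent. The function names and the toy baskets are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from fractions import Fraction


def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = frozenset(items)
    return Fraction(sum(1 for b in baskets if items <= b), len(baskets))


def confidence(baskets, antecedent, consequent):
    """P(B | A): normalized by the antecedent only."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)


def conviction(baskets, antecedent, consequent):
    """Assumed formula: P(A) * P(not B) / P(A and not B).

    Equals 1 when A and B are independent and is infinite when the
    rule A -> B is never violated in the data.
    """
    p_a = support(baskets, antecedent)
    p_b = support(baskets, consequent)
    both = set(antecedent) | set(consequent)
    p_a_not_b = p_a - support(baskets, both)
    if p_a_not_b == 0:
        return float("inf")
    return p_a * (1 - p_b) / p_a_not_b


# Hypothetical toy data: five market baskets.
baskets = [frozenset(b) for b in (
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread"},
    {"milk"},
    {"milk", "eggs"},
)]

bread_milk_conf = confidence(baskets, {"bread"}, {"milk"})
bread_milk_conv = conviction(baskets, {"bread"}, {"milk"})
eggs_milk_conv = conviction(baskets, {"eggs"}, {"milk"})
```

On this toy data the rule bread -> milk has confidence 2/3, which looks respectable, but milk already appears in 4 of 5 baskets, so the conviction is 3/5, below the independence value of 1. The rule eggs -> milk is never violated, so its conviction is infinite.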

Citations

Book

Data Mining: Concepts and Techniques

Jiawei Han, +2 more

TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

Book

Data Mining: Practical Machine Learning Tools and Techniques

Ian H. Witten, +2 more

TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.

Proceedings Article

Automatic subspace clustering of high dimensional data for data mining applications

Rakesh Agrawal, +3 more

TL;DR: CLIQUE is presented, a clustering algorithm that satisfies the requirements of data mining applications, including the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records.

Journal Article

Scalable algorithms for association mining

Mohammed J. Zaki

- 01 May 2000 -

IEEE Transactions on Knowledge and Data ...

TL;DR: Efficient algorithms are presented for the discovery of frequent itemsets, which forms the compute-intensive phase of the association mining task, along with the effect of using different database layout schemes combined with the proposed decomposition and traversal techniques.

Book Chapter

Discovering Frequent Closed Itemsets for Association Rules

Nicolas Pasquier, +3 more

TL;DR: This paper proposes a new algorithm, called A-Close, using a closure mechanism to find frequent closed itemsets, and shows that this approach is very valuable for dense and/or correlated data that represent an important part of existing databases.

References

Proceedings Article

Mining association rules between sets of items in large databases

Rakesh Agrawal, +2 more

TL;DR: An efficient algorithm is presented that generates all significant association rules between items in the database of customer transactions and incorporates buffer management and novel estimation and pruning techniques.

Proceedings Article

Fast algorithms for mining association rules

Rakesh Agrawal, +1 more

TL;DR: Two new algorithms for solving this problem that are fundamentally different from the known algorithms are presented, and empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.

Proceedings Article

Fast Algorithms for Mining Association Rules in Large Databases

Rakesh Agrawal, +1 more

Proceedings Article

Mining sequential patterns

Rakesh Agrawal, +1 more

TL;DR: Three algorithms are presented to solve the problem of mining sequential patterns over databases of customer transactions, and empirically evaluating their performance using synthetic data shows that two of them have comparable performance.

Journal Article

Mining generalized association rules

Ramakrishnan Srikant, +1 more

- 01 Nov 1997 -

Future Generation Computer Systems

TL;DR: A new interest-measure for rules which uses the information in the taxonomy is presented, and given a user-specified “minimum-interest-level”, this measure prunes a large number of redundant rules.

Related Papers (5)

Mining association rules between sets of items in large databases

Fast algorithms for mining association rules

Fast Algorithms for Mining Association Rules in Large Databases

Mining frequent patterns without candidate generation

An Efficient Algorithm for Mining Association Rules in Large Databases