Efficiently mining long patterns from databases

doi:10.1145/276304.276313

Proceedings Article (Open Access)

Roberto J. Bayardo

- Vol. 27, Iss: 2, pp 85-93

TLDR

A pattern-mining algorithm that scales roughly linearly in the number of maximal patterns embedded in a database irrespective of the length of the longest pattern, compared with previous algorithms that scale exponentially with longest pattern length.

Abstract:

We present a pattern-mining algorithm that scales roughly linearly in the number of maximal patterns embedded in a database irrespective of the length of the longest pattern. In comparison, previous algorithms based on Apriori scale exponentially with longest pattern length. Experiments on real data show that when the patterns are long, our algorithm is more efficient by an order of magnitude or more.
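To make the term "maximal patterns" concrete, here is a minimal brute-force sketch in Python. It is not the algorithm from the paper; the function names, the toy transactions, and the support threshold are all assumptions chosen for illustration. It only shows why one long frequent pattern of length k implies 2^k - 1 frequent sub-patterns, while the number of maximal patterns can stay small.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Enumerate every itemset contained in at least min_support transactions.

    Brute force, level by level; illustrative only (hypothetical helper).
    """
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                frequent[cset] = count
                found_any = True
        if not found_any:
            # No frequent itemset of this size, so no larger one can be
            # frequent either (every subset of a frequent set is frequent).
            break
    return frequent


def maximal_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


# Toy database (an assumption, not data from the paper).
transactions = [
    frozenset("abcde"),
    frozenset("abcde"),
    frozenset("xy"),
    frozenset("xy"),
    frozenset("ax"),
]

frequent = frequent_itemsets(transactions, min_support=2)
maximal = maximal_itemsets(frequent)

print(len(frequent))                        # number of frequent itemsets
print(sorted("".join(sorted(s)) for s in maximal))
```

On this toy data there are 34 frequent itemsets but only 2 maximal ones (`abcde` and `xy`): the length-5 pattern alone contributes 31 frequent subsets.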

Citations

Book

Data Mining: Concepts and Techniques

Jiawei Han, +2 more

TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

Journal Article

Mining frequent patterns without candidate generation

Jiawei Han, +2 more

TL;DR: This study proposes a novel frequent pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develops an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth.

Proceedings Article

Automatic subspace clustering of high dimensional data for data mining applications

Rakesh Agrawal, +3 more

TL;DR: CLIQUE is presented, a clustering algorithm that satisfies each of these requirements of data mining applications including the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records.

Data Mining: Concepts and Techniques (2nd edition)

Jiawei Han, +1 more

TL;DR: There have been many data mining books published in recent years, including Predictive Data Mining by Weiss and Indurkhya [WI98], Data Mining Solutions: Methods and Tools for Solving Real-World Problems by Westphal and Blaxton [WB98], and Mastering Data Mining: The Art and Science of Customer Relationship Management by Berry and Linoff [BL99].

Journal Article

Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach

Jiawei Han, +3 more

01 Jan 2004, Data Mining and Knowledge Discovery

TL;DR: A novel frequent-pattern tree (FP-tree) structure is proposed, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and an efficient FP-tree-based mining method, FP-growth, is developed for mining the complete set of frequent patterns by pattern fragment growth.
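The prefix-tree idea behind the FP-tree can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the class and function names are hypothetical, the header table and node links of the full structure are omitted, and the FP-growth mining step is not shown. It only demonstrates how transactions that share frequent items collapse onto shared paths with counts.

```python
from collections import Counter


class FPNode:
    """One node of a simplified FP-tree (hypothetical minimal version)."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a prefix tree over the frequent items of each transaction."""
    # First pass: count in how many transactions each item occurs.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = FPNode(None)
    # Second pass: insert each transaction's frequent items, ordered by
    # descending support (ties broken by item name) so prefixes are shared.
    for t in transactions:
        ordered = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


def count_nodes(node):
    """Number of nodes below the given node."""
    return sum(1 + count_nodes(c) for c in node.children.values())


# Toy database (an assumption for illustration).
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
root, frequent = build_fp_tree(transactions, min_support=2)

print(count_nodes(root))            # nodes needed to store all transactions
print(root.children["a"].count)     # transactions sharing the prefix "a"
```

In this toy run the infrequent item `d` is dropped and the three transactions containing `a` share a single `a` node, so the four transactions fit in five nodes.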

References

Proceedings Article

Mining association rules between sets of items in large databases

Rakesh Agrawal, +2 more

TL;DR: An efficient algorithm is presented that generates all significant association rules between items in the database of customer transactions and incorporates buffer management and novel estimation and pruning techniques.

Proceedings Article

Fast algorithms for mining association rules

Rakesh Agrawal, +1 more

TL;DR: Two new algorithms for solving this problem that are fundamentally different from the known algorithms are presented, and empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.

Proceedings Article

Mining sequential patterns

Rakesh Agrawal, +1 more

TL;DR: Three algorithms are presented to solve the problem of mining sequential patterns over databases of customer transactions, and empirically evaluating their performance using synthetic data shows that two of them have comparable performance.

Book Chapter

Mining Sequential Patterns: Generalizations and Performance Improvements

Ramakrishnan Srikant, +2 more

TL;DR: This work adds time constraints that specify a minimum and/or maximum time period between adjacent elements in a pattern, and relax the restriction that the items in an element of a sequential pattern must come from the same transaction.

Proceedings Article

Fast discovery of association rules

Rakesh Agrawal, +4 more

Related Papers (5)

Mining association rules between sets of items in large databases

Mining frequent patterns without candidate generation

Fast algorithms for mining association rules

Fast Algorithms for Mining Association Rules in Large Databases

Dynamic itemset counting and implication rules for market basket data