Efficient Algorithms for Mining High Utility Itemsets from Transactional Databases

doi:10.1109/TKDE.2012.59

Journal Article

Vincent S. Tseng, +3 more

- 01 Aug 2013 -

IEEE Transactions on Knowledge and Data ...

- Vol. 25, Iss: 8, pp 1772-1786

Abstract:

Mining high utility itemsets from a transactional database refers to the discovery of itemsets with high utility, such as profit. Although a number of relevant algorithms have been proposed in recent years, they suffer from producing a large number of candidate itemsets for high utility itemsets. Such a large number of candidates degrades mining performance in terms of both execution time and space requirements. The situation may become worse when the database contains many long transactions or long high utility itemsets. In this paper, we propose two algorithms, namely utility pattern growth (UP-Growth) and UP-Growth+, for mining high utility itemsets with a set of effective strategies for pruning candidate itemsets. Information about high utility itemsets is maintained in a tree-based data structure named the utility pattern tree (UP-Tree), so that candidate itemsets can be generated efficiently with only two scans of the database. The performance of UP-Growth and UP-Growth+ is compared with state-of-the-art algorithms on many types of both real and synthetic data sets. Experimental results show that the proposed algorithms, especially UP-Growth+, not only reduce the number of candidates effectively but also outperform other algorithms substantially in terms of runtime, especially when databases contain many long transactions.
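
To make the problem statement concrete, the following is a minimal Python sketch of the underlying definitions, not of UP-Growth or UP-Growth+ themselves. It assumes the usual formulation in which an item's utility in a transaction is its purchased quantity times a unit profit, and it uses transaction-weighted utility (TWU), the overestimate used by two-phase approaches, to discard unpromising items before a brute-force enumeration. The toy transactions, the profit table, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 2, "d": 1},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1, "d": 10}


def transaction_utility(tx):
    """Total utility of all items in one transaction."""
    return sum(qty * profit[item] for item, qty in tx.items())


def utility(itemset):
    """Utility of an itemset: summed over transactions that contain all of it."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * profit[item] for item in itemset)
    return total


def twu(itemset):
    """Transaction-weighted utility: an upper bound on the utility of the
    itemset and of every superset of it."""
    return sum(
        transaction_utility(tx)
        for tx in transactions
        if all(item in tx for item in itemset)
    )


def mine_high_utility_itemsets(min_util):
    """Brute-force miner (illustration only; exponential in the item count)."""
    # An item whose TWU is below the threshold cannot occur in any
    # high utility itemset, so it is dropped before enumeration.
    promising = [item for item in sorted(profit) if twu((item,)) >= min_util]
    result = {}
    for size in range(1, len(promising) + 1):
        for candidate in combinations(promising, size):
            u = utility(candidate)
            if u >= min_util:
                result[candidate] = u
    return result


if __name__ == "__main__":
    print(mine_high_utility_itemsets(15))
```

With a threshold of 15, the sketch reports {a} with utility 15 and {a, c} with utility 19, while {a, b} reaches only 9. Utility is therefore not anti-monotone, which is why candidate pruning relies on an overestimate such as TWU rather than on the utility itself.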

Citations

Proceedings Article

Mining high utility itemsets without candidate generation

Mengchi Liu, +1 more

TL;DR: This paper proposes an algorithm, called HUI-Miner (High Utility Itemset Miner), which can efficiently mine high utility itemsets from the utility-lists constructed from a mined database and compares it with the state-of-the-art algorithms on various databases.

Book Chapter

FHM: Faster High-Utility Itemset Mining using Estimated Utility Co-occurrence Pruning

Philippe Fournier-Viger, +3 more

TL;DR: An extensive experimental study with four real-life datasets shows that the resulting algorithm named FHM (Fast High-Utility Miner) reduces the number of join operations by up to 95 % and is up to six times faster than the state-of-the-art algorithm HUI-Miner.

Journal Article

Pruning strategies for mining high utility itemsets

Srikumar Krishnamoorthy

- 01 Apr 2015 -

Expert Systems With Applications

TL;DR: Experimental results reveal that the proposed method is very effective in pruning unpromising candidates, especially for sparse transactional databases, and a comparative evaluation against a state-of-the-art utility mining method is presented.

Journal Article

EFIM: a fast and memory efficient algorithm for high-utility itemset mining

Souleymane Zida, +4 more

- 01 May 2017 -

Knowledge and Information Systems

TL;DR: A novel algorithm named EFIM (EFficient high-utility Itemset Mining), which introduces several new ideas to more efficiently discover high-utility itemsets and is in general two to three orders of magnitude faster than the state-of-the-art algorithms.

Journal Article

A survey of incremental high-utility itemset mining

Wensheng Gan, +5 more

- 01 Mar 2018 -

Wiley Interdisciplinary Reviews-Data Min...

TL;DR: This paper provides an up-to-date survey of the state-of-the-art iHUIM algorithms, including Apriori-based, tree-based, and utility-list-based approaches, and identifies several important issues and research challenges for iHUIM.

References

Proceedings Article

Fast algorithms for mining association rules

Rakesh Agrawal, +1 more

TL;DR: Two new algorithms for solving this problem that are fundamentally different from the known algorithms are presented, and empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.

Journal Article

Mining frequent patterns without candidate generation

Jiawei Han, +2 more

TL;DR: This study proposes a novel frequent pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develops an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth.

Proceedings Article

Mining sequential patterns

Rakesh Agrawal, +1 more

TL;DR: Three algorithms are presented to solve the problem of mining sequential patterns over databases of customer transactions, and empirically evaluating their performance using synthetic data shows that two of them have comparable performance.

Journal Article

Scalable algorithms for association mining

Mohammed J. Zaki

- 01 May 2000 -

IEEE Transactions on Knowledge and Data ...

TL;DR: Efficient algorithms are presented for the discovery of frequent itemsets, which forms the compute-intensive phase of the association mining task, along with the effect of using different database layout schemes combined with the proposed decomposition and traversal techniques.

Journal Article

Mining sequential patterns by pattern-growth: the PrefixSpan approach

Jian Pei, +7 more

- 01 Nov 2004 -

IEEE Transactions on Knowledge and Data ...

TL;DR: This paper proposes a projection-based, sequential pattern-growth approach for efficient mining of sequential patterns, and shows that PrefixSpan outperforms the Apriori-based algorithm GSP, FreeSpan, and SPADE and is the fastest among all the tested algorithms.

Related Papers (5)

Mining high utility itemsets without candidate generation

Efficient Tree Structures for High Utility Pattern Mining in Incremental Databases

A two-phase algorithm for fast discovery of high utility itemsets

Fast Algorithms for Mining Association Rules in Large Databases

Mining high utility itemsets