Journal Article

Efficient Algorithms for Mining High Utility Itemsets from Transactional Databases

Abstract
Mining high utility itemsets from a transactional database refers to the discovery of itemsets with high utility, such as profit. Although a number of relevant algorithms have been proposed in recent years, they suffer from producing a large number of candidate itemsets for high utility itemsets. Such a large number of candidate itemsets degrades mining performance in terms of execution time and space requirements. The situation may become worse when the database contains many long transactions or long high utility itemsets. In this paper, we propose two algorithms, namely utility pattern growth (UP-Growth) and UP-Growth+, for mining high utility itemsets with a set of effective strategies for pruning candidate itemsets. The information of high utility itemsets is maintained in a tree-based data structure named utility pattern tree (UP-Tree) such that candidate itemsets can be generated efficiently with only two scans of the database. The performance of UP-Growth and UP-Growth+ is compared with state-of-the-art algorithms on many types of both real and synthetic data sets. Experimental results show that the proposed algorithms, especially UP-Growth+, not only reduce the number of candidates effectively but also outperform other algorithms substantially in terms of runtime, especially when databases contain many long transactions.
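
To make the problem statement concrete, the following is a minimal Python sketch of high utility itemset mining on a toy database. It is not UP-Growth or UP-Growth+ and it builds no UP-Tree; it is a brute-force baseline that only illustrates what "utility" and "candidate pruning" mean. The data, the names `transactions`, `profit`, `twu`, and `mine_high_utility_itemsets`, and the threshold are all hypothetical. The pruning shown uses the standard transaction-weighted utility (TWU) upper bound from the utility mining literature, which is assumed here rather than taken from the abstract.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1}


def utility(itemset):
    """Total utility of an itemset: sum of quantity * profit over the
    transactions that contain every item of the itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def transaction_utility(t):
    """Utility of a whole transaction."""
    return sum(qty * profit[item] for item, qty in t.items())


def twu(itemset):
    """Transaction-weighted utility: an upper bound on the utility of the
    itemset and of all its supersets."""
    return sum(
        transaction_utility(t)
        for t in transactions
        if all(item in t for item in itemset)
    )


def mine_high_utility_itemsets(min_utility):
    """Brute-force miner: prune single items by TWU, then enumerate and
    check every combination of the remaining items."""
    promising = sorted(i for i in profit if twu((i,)) >= min_utility)
    result = {}
    for size in range(1, len(promising) + 1):
        for itemset in combinations(promising, size):
            u = utility(itemset)
            if u >= min_utility:
                result[itemset] = u
    return result


print(mine_high_utility_itemsets(10))
```

On this toy data the itemset `("a", "c")` has utility 11 even though `("c",)` alone has utility 4, so utility itself cannot be used for downward-closure pruning; the TWU bound is what lets unpromising items be discarded safely. The exhaustive enumeration in the sketch is exactly the candidate explosion that tree-based methods aim to avoid.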


Citations
Proceedings Article

Mining high utility itemsets without candidate generation

TL;DR: This paper proposes an algorithm, called HUI-Miner (High Utility Itemset Miner), which can efficiently mine high utility itemsets from the utility-lists constructed from a mined database and compares it with the state-of-the-art algorithms on various databases.
Book Chapter

FHM: Faster High-Utility Itemset Mining using Estimated Utility Co-occurrence Pruning

TL;DR: An extensive experimental study with four real-life datasets shows that the resulting algorithm named FHM (Fast High-Utility Miner) reduces the number of join operations by up to 95% and is up to six times faster than the state-of-the-art algorithm HUI-Miner.
Journal Article

Pruning strategies for mining high utility itemsets

TL;DR: Experimental results reveal that the proposed method is very effective in pruning unpromising candidates, especially for sparse transactional databases, and a comparative evaluation against a state-of-the-art utility mining method is presented.
Journal Article

EFIM: a fast and memory efficient algorithm for high-utility itemset mining

TL;DR: A novel algorithm named EFIM (EFficient high-utility Itemset Mining), which introduces several new ideas to more efficiently discover high-utility itemsets and is in general two to three orders of magnitude faster than the state-of-the-art algorithms.
Journal Article

A survey of incremental high-utility itemset mining

TL;DR: This paper provides an up-to-date survey of the state-of-the-art iHUIM algorithms, including Apriori-based, tree-based, and utility-list-based approaches, and identifies several important issues and research challenges for iHUIM.
References
Proceedings Article

Fast algorithms for mining association rules

TL;DR: Two new algorithms for solving this problem that are fundamentally different from the known algorithms are presented, and empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.
Journal Article

Mining frequent patterns without candidate generation

TL;DR: This study proposes a novel frequent pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develops an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth.
Proceedings Article

Mining sequential patterns

TL;DR: Three algorithms are presented to solve the problem of mining sequential patterns over databases of customer transactions, and empirically evaluating their performance using synthetic data shows that two of them have comparable performance.
Journal Article

Scalable algorithms for association mining

TL;DR: Efficient algorithms are presented for the discovery of frequent itemsets, which forms the compute-intensive phase of the association mining task, along with the effect of using different database layout schemes combined with the proposed decomposition and traversal techniques.
Journal Article

Mining sequential patterns by pattern-growth: the PrefixSpan approach

TL;DR: This paper proposes a projection-based, sequential pattern-growth approach for efficient mining of sequential patterns, and shows that PrefixSpan outperforms the Apriori-based algorithm GSP, FreeSpan, and SPADE and is the fastest among all the tested algorithms.