A Privacy-Preserving Mining Algorithm of Association Rules in Distributed Databases

doi:10.1109/IMSCCS.2006.164

Proceedings Article•DOI•

Jie Liu¹, Xiufeng Piao¹, Shaobin Huang¹•Institutions (1)

Harbin Engineering University¹

20 Apr 2006-Vol. 2, pp 746-750

TL;DR: Presents a secure algorithm for mining association rules that builds a global hash table to prune itemsets and incorporates cryptographic techniques to minimize the information shared.

Abstract: Association rule mining is one of the most important and fundamental problems in data mining. Driven by security needs, a growing number of researchers are studying privacy-preserving association rule mining in distributed databases. This paper presents a secure algorithm for mining association rules that builds a global hash table to prune itemsets and incorporates cryptographic techniques to minimize the information shared.
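
The abstract does not give the construction of the hash table, so the following is only a minimal sketch of hash-based candidate pruning in general, not the authors' algorithm. It assumes every site hashes its candidate k-itemsets into the same fixed number of buckets and that the per-site bucket counts are then added up. The names `NUM_BUCKETS`, `bucket`, `local_table`, `global_table` and `prune` are hypothetical, and the plain sum in `global_table` stands in for whatever cryptographic exchange the paper uses.

```python
from itertools import combinations

NUM_BUCKETS = 7  # hypothetical table size, chosen only for illustration


def bucket(itemset):
    # Deterministic hash so that every site maps an itemset to the same bucket.
    return sum(ord(ch) for item in sorted(itemset) for ch in item) % NUM_BUCKETS


def local_table(transactions, k):
    # Each site counts, per bucket, how many k-itemsets of its transactions land there.
    table = [0] * NUM_BUCKETS
    for transaction in transactions:
        for candidate in combinations(sorted(transaction), k):
            table[bucket(candidate)] += 1
    return table


def global_table(local_tables):
    # Assumption: a plain element-wise sum. A privacy-preserving protocol
    # would compute this sum without revealing the individual tables.
    return [sum(column) for column in zip(*local_tables)]


def prune(candidates, table, min_count):
    # A bucket count is an upper bound on the support of every itemset in it,
    # so a candidate in a bucket below min_count cannot be globally frequent.
    return [c for c in candidates if table[bucket(c)] >= min_count]
```

For example, with two sites holding `[{"a", "b"}, {"a", "b", "c"}]` and `[{"a", "b"}, {"c", "d"}]`, calling `prune` with a minimum count of 2 on the 2-itemset candidates keeps only `("a", "b")`. Hash collisions can let an infrequent candidate survive, but they never remove a frequent one.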

Citations

Journal Article•DOI•

Privacy-preserving distributed mining of association rules using Elliptic-curve cryptosystem and Shamir’s secret sharing scheme

Harendra Chahar¹, Bettahally N. Keshavamurthy¹, Chirag Modi¹•Institutions (1)

National Institute of Technology Goa¹

17 Nov 2017-Sadhana-academy Proceedings in Engineering Sciences

TL;DR: This work proposes two protocols that securely generate global association rules in horizontally distributed databases; the second incorporates Shamir’s secret sharing scheme, which preserves privacy against colluding sites and external adversary attacks.

Abstract: Distributed data mining has played a vital role in numerous application domains. However, it is widely observed that data mining may pose a privacy threat to individuals’ sensitive information. To address the privacy problem in distributed association rule mining (a data mining technique), we propose two protocols that securely generate global association rules in horizontally distributed databases. The first protocol uses an elliptic-curve-based Paillier cryptosystem, which helps achieve the integrity and authenticity of the messages exchanged among the participating sites over an insecure communication channel. It protects each site’s information against the other participating sites and an external adversary. However, collusion between two sites may compromise the privacy of individuals. To address this problem, we incorporate Shamir’s secret sharing scheme in the second protocol, which preserves privacy against colluding sites and external adversary attacks. We analyse both protocols in terms of fulfilling the requirements of privacy-preserving distributed association rule mining.
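
The abstract names Shamir’s secret sharing but does not give the protocol steps. As a rough illustration of why the scheme suits this setting, the sketch below shows how shares of several local support counts can be added share by share and the total reconstructed, without any single share revealing a local count. It is a generic textbook construction, not the authors' second protocol. The field modulus `PRIME` and the function names `make_shares`, `reconstruct` and `secure_sum` are assumptions made for the example.

```python
import secrets

PRIME = 2_147_483_647  # 2**31 - 1; the field size is an assumption for this sketch


def make_shares(secret, threshold, n_sites):
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n_sites + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    # Lagrange interpolation at x = 0 recovers the constant term.
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


def secure_sum(local_counts, threshold):
    n = len(local_counts)
    all_shares = [make_shares(c, threshold, n) for c in local_counts]
    # Site k adds the k-th share it received from every site. The summed
    # shares lie on the sum of the polynomials, whose constant term is the total.
    summed = [(k + 1, sum(s[k][1] for s in all_shares) % PRIME) for k in range(n)]
    return reconstruct(summed[:threshold])
```

With local counts 12, 7 and 30 and a threshold of 2, `secure_sum([12, 7, 30], 2)` reconstructs the global count 49. Fewer than `threshold` colluding sites learn nothing about an individual count from their shares alone. The modular inverse via `pow(den, -1, PRIME)` requires Python 3.8 or later.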

14 citations

Cites background from "A Privacy-Preserving Mining Algorit..."

...Liu et al [21] have designed a privacy-preserving association rules mining algorithm in distributed environment, wherein a global hash table is built to prune candidate itemsets in early iteration of mining operation, which increases the efficiency of algorithm....
[...]

Posted Content•

An Improved Approach to High Level Privacy Preserving Itemset Mining

Rajesh Kumar Boora, Ruchi Shukla, Arun Kumar Misra

13 Jan 2010-arXiv: Databases

TL;DR: A new transaction randomization method is proposed that combines the fake transaction randomization method with a new per-transaction randomization method, ensuring a higher level of data privacy than previous approaches.

Abstract: Privacy preserving association rule mining has triggered the development of many privacy preserving data mining techniques. A large fraction of them use randomized data distortion techniques to mask the data and so preserve privacy. This paper proposes a new transaction randomization method which is a combination of the fake transaction randomization method and a new per-transaction randomization method. This method distorts the items within each transaction and ensures a higher level of data privacy in comparison to the previous approaches. The per-transaction randomization method uses a randomization function to replace each item with a random number, guaranteeing privacy within the transaction as well. A tool has also been developed to implement the proposed approach and mine frequent itemsets and association rules from the data while guaranteeing the anti-monotonic property.

12 citations

Additional excerpts

...An algorithm for privacy preserving mining of association rules in distributed databases that builds a global hashing table Hi in every iteration, is proposed by Liu [18]....
[...]
...Figure 7 shows the comparison between the new approach and the approach of Lin and Liu....
[...]

Proceedings Article•DOI•

Privacy preserving association rules in unsecured distributed environment using cryptography

A. C. Patel¹, Udai Pratap Rao¹, Dhiren Patel¹•Institutions (1)

Sardar Vallabhbhai National Institute of Technology, Surat¹

26 Jul 2012

TL;DR: This paper proposes an algorithm to mine association rules using elliptic curve cryptography over horizontally partitioned data; it provides security against the participating parties and intruders, and also provides authentication between the participating parties.

Abstract: In this paper, we propose an algorithm to mine association rules using elliptic curve cryptography over horizontally partitioned data. We consider an unsecured distributed environment. The proposed algorithm provides security against the participating parties and intruders, and also provides authentication between the participating parties. Finally, we analyze the privacy and security provided by the proposed algorithm.

10 citations

Cites background from "A Privacy-Preserving Mining Algorit..."

...[5] In terms of association rule mining, the rule XY ⇒ Z will be discovered as long as it satisfies these conditions: $\sum_{i=1}^{\text{sites}} \text{Support\_Count}_{XYZ}(i)$....
[...]

Book Chapter•DOI•

Privacy Preserving Association Rule Mining in Horizontally Partitioned Databases Without Involving Trusted Third Party (TTP)

Chirag Modi¹, Ashwini R. Patil•Institutions (1)

National Institute of Technology Goa¹

01 Jan 2016

TL;DR: A security protocol that derives association rules securely from horizontally distributed databases without a trusted third party (TTP) and ensures the owners’ privacy and security with the help of elliptic-curve-based Diffie-Hellman and the Digital Signature Algorithm.

Abstract: In this research work, we design a security protocol that derives association rules securely from horizontally distributed databases without a trusted third party (TTP), even when the communication channel between the participating sites is unsecured. It ensures the owners’ privacy and security with the help of elliptic-curve-based Diffie-Hellman and the Digital Signature Algorithm.

6 citations

DOI•

Elliptic Curve Cryptography Based Mining of Privacy Preserving Association Rules in Unsecured Distributed Environment

Chirag Modi, Udai Pratap Rao, Dhiren Patel

04 Oct 2010

TL;DR: An elliptic curve cryptography based algorithm is proposed to mine privacy-preserving association rules on horizontally partitioned data; it provides privacy and security against the participating parties and against adversaries who could learn information by reading the unsecured channel between them.

Abstract: Distributed data mining techniques are used in a variety of applications. Recent investigations of their privacy and security have concluded that the parties involved reveal data or information to one another in order to find globally valid results. Because of privacy concerns, however, the participating parties do not want to reveal such data. Many cryptographic techniques have recently been proposed to address privacy problems in distributed mining. In this paper, we propose an elliptic curve cryptography based algorithm to mine privacy-preserving association rules on horizontally partitioned data. We also consider unsecured communication channels in the distributed environment. The proposed algorithm provides privacy and security against the participating parties and against other parties (adversaries) who could learn information by reading the unsecured channel between them. Finally, we analyze the privacy and security provided by the proposed algorithm and discuss its communication and computation costs.

Keywords: Data Mining; Association Rules; Distributed Databases; Privacy; Security; Elliptic Curve Cryptography

5 citations

Cites background from "A Privacy-Preserving Mining Algorit..."

...[15] proposed an efficient and secure algorithm for mining privacy preserved association rules in distributed environment, which uses hashing of candidate itemsets and filter out the itemset whose support exceeds the minimum thresholds....
[...]

References

Proceedings Article•DOI•

Mining association rules between sets of items in large databases

Rakesh Agrawal¹, Tomasz Imielinski², Arun N. Swami¹•Institutions (2)

IBM¹, Rutgers University²

01 Jun 1993

TL;DR: An efficient algorithm is presented that generates all significant association rules between items in the database of customer transactions and incorporates buffer management and novel estimation and pruning techniques.

Abstract: We are given a large database of customer transactions. Each transaction consists of items purchased by a customer in a visit. We present an efficient algorithm that generates all significant association rules between items in the database. The algorithm incorporates buffer management and novel estimation and pruning techniques. We also present results of applying this algorithm to sales data obtained from a large retailing company, which shows the effectiveness of the algorithm.

15,645 citations

"A Privacy-Preserving Mining Algorit..." refers methods in this paper

...CD [10] algorithm, which is an adaptation of the Apriori’s, has a simple communication scheme for count exchange, but it has problems of higher number of candidate sets and larger amount of communication....
[...]

Proceedings Article•

Fast algorithms for mining association rules

Rakesh Agrawal, Ramakrishnan Srikant

01 Jul 1998

TL;DR: Two new algorithms for solving this problem that are fundamentally different from the known algorithms are presented, and empirical evaluation shows that they outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.

Abstract: We consider the problem of discovering association rules between items in a large database of sales transactions. We present two new algorithms for solving this problem that are fundamentally different from the known algorithms. Empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems. We also show how the best features of the two proposed algorithms can be combined into a hybrid algorithm, called AprioriHybrid. Scale-up experiments show that AprioriHybrid scales linearly with the number of transactions. AprioriHybrid also has excellent scale-up properties with respect to the transaction size and the number of items in the database.
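
The abstract does not describe how the algorithms work. As a reader aid only, the sketch below illustrates the level-wise idea commonly associated with Apriori: count candidates of size k, keep those that meet the minimum count, join them into candidates of size k + 1, and discard any candidate with an infrequent subset. It is a simplified illustration, not the paper's implementation, and the function name `apriori` and its parameters are chosen for the example.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return a dict mapping each frequent itemset (frozenset) to its count."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that occurs in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k - 1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent
```

On the four transactions `{"bread", "milk"}`, `{"bread", "butter"}`, `{"bread", "milk", "butter"}` and `{"milk"}` with a minimum count of 2, the candidate `{"bread", "milk", "butter"}` is pruned without being counted, because its subset `{"milk", "butter"}` occurs only once.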

10,863 citations

Journal Article•DOI•

Mining association rules between sets of items in large databases

Rakesh Agrawal, Tomasz Imieliński, Arun Swami

01 Jun 1993-Sigmod Record

TL;DR: An efficient algorithm is presented that generates all significant association rules between items in a large database of customer transactions, where each transaction consists of items purchased by a customer in a visit.

3,198 citations

Journal Article•DOI•

Data mining: an overview from a database perspective

Ming-Syan Chen¹, Jiawei Han², Philip S. Yu³•Institutions (3)

National Taiwan University¹, Simon Fraser University², IBM³

01 Dec 1996-IEEE Transactions on Knowledge and Data Engineering

TL;DR: In this paper, a survey of the available data mining techniques is provided and a comparative study of such techniques is presented, based on a database researcher's point-of-view.

Abstract: Mining information and knowledge from large databases has been recognized by many researchers as a key research topic in database systems and machine learning, and by many industrial companies as an important area with an opportunity of major revenues. Researchers in many different fields have shown great interest in data mining. Several emerging applications in information-providing services, such as data warehousing and online services over the Internet, also call for various data mining techniques to better understand user behavior, to improve the service provided and to increase business opportunities. In response to such a demand, this article provides a survey, from a database researcher's point of view, on the data mining techniques developed recently. A classification of the available data mining techniques is provided and a comparative study of such techniques is presented.

2,327 citations

Proceedings Article•

An Efficient Algorithm for Mining Association Rules in Large Databases

Ashoka Savasere, Edward Omiecinski, Shamkant B. Navathe

11 Sep 1995

TL;DR: This paper presents an efficient algorithm for mining association rules that is fundamentally different from known algorithms and not only reduces the I/O overhead significantly but also has lower CPU overhead for most cases.

Abstract: Mining for association rules between items in a large database of sales transactions has been described as an important database mining problem. In this paper we present an efficient algorithm for mining association rules that is fundamentally different from known algorithms. Compared to previous algorithms, our algorithm not only reduces the I/O overhead significantly but also has lower CPU overhead for most cases. We have performed extensive experiments and compared the performance of our algorithm with one of the best existing algorithms. It was found that for large databases, the CPU overhead was reduced by as much as a factor of four and I/O was reduced by almost an order of magnitude. Hence this algorithm is especially suitable for very large size databases.

1,822 citations